(57) Abstract

Disclosed is a nucleotide sequence encoding an effective portion of a class A starch branching enzyme (SBE) obtainable from potato
plants, or a functional equivalent thereof, together with, inter alia, a corresponding polypeptide, a method of altering the characteristics of
a plant, a plant having altered characteristics; and starch, particularly starch obtained from a potato plant, having novel properties.





WO 96/34968
PCT/GB96/01075

Title: Improvements in or Relating to Plant Starch Composition

Field of the Invention

This invention relates to novel nucleotide sequences, polypeptides encoded thereby, vectors 
and host cells and host organisms comprising one or more of the novel sequences, and to 
a method of altering one or more characteristics of an organism. The invention also
relates to starch having novel properties and to uses thereof. 

Background of the Invention 

Starch is the major form of carbon reserve in plants, constituting 50% or more of the dry 
weight of many storage organs - e.g. tubers, seeds of cereals. Starch is used in numerous 
food and industrial applications. In many cases, however, it is necessary to modify the
native starches, via chemical or physical means, in order to produce distinct properties to
suit particular applications. It would be highly desirable to be able to produce starches
with the required properties directly in the plant, thereby removing the need for additional
modification. To achieve this via genetic engineering requires knowledge of the metabolic
pathway of starch biosynthesis. This includes characterisation of genes and encoded gene
products which catalyse the synthesis of starch. Knowledge about the regulation of starch 
biosynthesis raises the possibility of "re-programming" biosynthetic pathways to create 
starches with novel properties that could have new commercial applications. 

The commercially useful properties of starch derive from the ability of the native granular 
form to swell and absorb water upon suitable treatment. Usually heat is required to cause 
granules to swell in a process known as gelatinisation, which has been defined (W A 
Atwell et al., Cereal Foods World 33, 306-311, 1988) as "... the collapse (disruption) of
molecular orders within the starch granule manifested in irreversible changes in properties
such as granular swelling, native crystallite melting, loss of birefringence, and starch
solubilisation. The point of initial gelatinisation and the range over which it occurs is
governed by starch concentration, method of observation, granule type, and heterogeneities
within the granule population under observation". A number of techniques are available
for the determination of gelatinisation as induced by heating, a convenient and accurate
method being differential scanning calorimetry, which detects the temperature range and
enthalpy associated with the collapse of molecular orders within the granule. To obtain
accurate and meaningful results, the peak and/or onset temperature of the endotherm
observed by differential scanning calorimetry is usually determined.

The consequence of the collapse of molecular orders within starch granules is that the
granules are capable of taking up water in a process known as pasting, which has been
defined (W A Atwell et al., Cereal Foods World 33, 306-311, 1988) as "... the
phenomenon following gelatinisation in the dissolution of starch. It involves granular
swelling, exudation of molecular components from the granule, and eventually, total
disruption of the granules". The best method of evaluating pasting properties is
considered to be the viscoamylograph (Atwell et al., 1988 cited above) in which the
viscosity of a stirred starch suspension is monitored under a defined time/temperature
regime. A typical viscoamylograph profile for potato starch shows an initial rise in
viscosity, which is considered to be due to granule swelling. In addition to the overall
shape of the viscosity response in a viscoamylograph, a convenient quantitative measure
is the temperature of initial viscosity development (onset). Figure 1 shows such a typical
viscosity profile for potato starch, during and after cooking, and includes stages A-D
which correspond to viscosity onset (A), maximum viscosity (B), complete dispersion (C)
and reassociation of molecules (or retrogradation, D). In the figure, the dotted line
represents viscosity (in stirring number units) of a 10% w/w starch suspension and the
unbroken line shows the temperature in degrees centigrade. At a certain point, defined
by the viscosity peak, granule swelling is so extensive that the resulting highly expanded
structures are susceptible to mechanically-induced fragmentation under the stirring
conditions used. With increased heating and holding at 95°C, further reduction in
viscosity is observed due to increased fragmentation of swollen granules. This general 
profile has previously always been found for native potato starch. 

After heating starches in water to 95°C and holding at that temperature (for typically 15
minutes), subsequent cooling to 50°C results in an increase in viscosity due to the process
of retrogradation or set-back. Retrogradation (or set-back) is defined (Atwell et al., 1988
cited above) as "...a process which occurs when the molecules comprising gelatinised
starch begin to reassociate in an ordered structure...". At 50°C, it is primarily the
amylose component which reassociates, as indicated by the increase in viscoamylograph
viscosity for starch from normal maize (21.6% amylose) compared with starch from waxy
maize (1.1% amylose) as shown in Figure 2. Figure 2 is a viscoamylograph of 10% w/w
starch suspensions from waxy maize (solid line), conventional maize (dots and dashes),
a high amylose variety (Hylon 5, dotted line) and a very high amylose variety (Hylon 7,
crosses). The temperature profile is also shown by a solid line, as in Figure 1. The
extent of viscosity increase in the viscoamylograph on cooling and holding at 50°C
depends on the amount of amylose which is able to reassociate due to its exudation from
starch granules during the gelatinisation and pasting processes. A characteristic of
amylose-rich starches from maize plants is that very little amylose is exuded from granules
by gelatinisation and pasting up to 95°C, probably due to the restricted swelling of the
granules. This is illustrated in Figure 2 which shows low viscosities for a high amylose
(44.9%) starch (Hylon 5) from maize during gelatinisation and pasting at 95°C and little
increase in viscosity on cooling and holding at 50°C. This effect is more extreme for a
higher amylose content (58%, as in Hylon 7), which shows even lower viscosities in the
viscoamylograph test (Figure 2). For commercially-available high amylose starches
(currently available from maize plants, such as those described above), processing at
greater than 100°C is usually necessary in order to generate the benefits of high amylose
contents with respect to increased rates and strengths of reassociation, but use of such high
temperatures is energetically unfavourable and costly. Accordingly, there is an unmet
need for starches of high amylose content which can be processed below 100°C and still
show enhanced levels of reassociation, as indicated for example by viscoamylograph
measurements.

The properties of potato starch are useful in a variety of both food and non-food (paper,
textiles, adhesives etc.) applications. However, for many applications, properties are not
optimum and various chemical and physical modifications well known in the art are
undertaken in order to improve useful properties. Two types of property manipulation
which would be of use are: the controlled alteration of gelatinisation and pasting
temperatures; and starches which suffer less granular fragmentation during pasting than
conventional starches.

Currently the only ways of manipulating the gelatinisation and pasting temperatures of
potato starch are by the inclusion of additives such as sugars, polyhydroxy compounds or
salts (Evans & Haisman, Stärke 34, 224-231, 1982) or by extensive physical or chemical
pre-treatments (e.g. Stute, Stärke 44, 205-214, 1992). The reduction of granule
fragmentation during pasting can be achieved either by extensive physical pre-treatments
(Stute, Stärke 44, 205-214, 1992) or by chemical cross-linking. Such processes are
inconvenient and inefficient. It is therefore desirable to obtain plants which produce starch
which intrinsically possesses such advantageous properties.

Starch consists of two main polysaccharides, amylose and amylopectin. Amylose is a 
generally linear polymer containing α-1,4 linked glucose units, while amylopectin is a
highly branched polymer consisting of an α-1,4 linked glucan backbone with α-1,6 linked
glucan branches. In most plant storage reserves amylopectin constitutes about 75% of the
starch content. Amylopectin is synthesized by the concerted action of soluble starch
synthase and starch branching enzyme [α-1,4 glucan: α-1,4 glucan 6-glycosyltransferase,
EC 2.4.1.18]. Starch branching enzyme (SBE) hydrolyses α-1,4 linkages and rejoins the
cleaved glucan, via an α-1,6 linkage, to an acceptor chain to produce a branched structure.
The physical properties of starch are strongly affected by the relative abundance of
amylose and amylopectin, and SBE is therefore a crucial enzyme in determining both the
quantity and quality of starches produced in plant systems. 

In most plants studied to date e.g. maize (Boyer & Preiss, 1978 Biochem. Biophys. Res.
Comm. 80, 169-175), rice (Smyth, 1988 Plant Sci. 57, 1-8) and pea (Smith, Planta 175,
270-279), two forms of SBE have been identified, each encoded by a separate gene. A
recent review by Burton et al. (1995 The Plant Journal 7, 3-15) has demonstrated that
the two forms of SBE constitute distinct classes of the enzyme such that, in general,
enzymes of the same class from different plants may exhibit greater similarity than
enzymes of different classes from the same plant. In their review, Burton et al. termed
the two respective enzyme families class "A" and class "B", and the reader is referred
thereto (and to the references cited therein) for a detailed discussion of the distinctions
between the two classes. One general distinction of note would appear to be the presence,
in class A SBE molecules, of a flexible N-terminal domain, which is not found in class
B molecules. The distinctions noted by Burton et al. are relied on herein to define class
A and class B SBE molecules, which terms are to be interpreted accordingly. 

However, in potato, only one isoform of the SBE molecule (belonging to class B) has thus
far been reported and only one gene cloned (Blennow & Johansson, 1991 Phytochem. 30,
437-444, and Koßmann et al., 1991 Mol. Gen. Genet. 230, 39-44). Further, published
attempts to modify the properties of starch in potato plants (by preventing expression of
the single known SBE) have generally not succeeded (e.g. Müller-Röber & Koßmann 1994
Plant Cell and Environment 17, 601-613).

Summary of the Invention 

In a first aspect the invention provides a nucleotide sequence encoding an effective portion 
of a class A starch branching enzyme (SBE) obtainable from potato plants. 

Preferably the nucleotide sequence encodes a polypeptide comprising an effective portion 
of the amino acid sequence shown in Figure 5 (excluding the sequence MNKRIDL, which 
does not represent part of the SBE molecule), or a functional equivalent thereof (which
term is discussed below). The amino acid sequence shown in Figure 5 (Seq ID No. 15)
includes a leader sequence which directs the polypeptide, when synthesised in potato cells,
to the amyloplast. Those skilled in the art will recognise that the leader sequence is
removed to produce a mature enzyme and that the leader sequence is therefore not
essential for enzyme activity. Accordingly, an "effective portion" of the polypeptide is
one which possesses sufficient SBE activity to complement the branching enzyme mutation
in E. coli KV 832 cells (described below) and which is active when expressed in E. coli
in the phosphorylation stimulation assay. An example of an incomplete polypeptide which
nevertheless constitutes an "effective portion" is the mature enzyme lacking the leader
sequence. By analogy with the pea class A SBE sequence, the potato class A sequence
shown in Figure 5 probably possesses a leader sequence of about 48 amino acid residues,
such that the N terminal amino acid sequence is thought to commence around the glutamic
acid residue (E) at position 49 (EKSSYN... etc.). Those skilled in the art will appreciate
that an effective portion of the enzyme may well omit other parts of the sequence shown
in the figure without substantial detrimental effect. For example, the C-terminal glutamic
acid-rich region could be reduced in length, or possibly deleted entirely, without
abolishing class A SBE activity. A comparison with other known SBE sequences,
especially other class A SBE sequences (see for example, Burton et al., 1995 cited above),
should indicate those portions which are highly conserved (and thus likely to be essential
for activity) and those portions which are less well conserved (and thus are more likely
to tolerate sequence changes without substantial loss of enzyme activity). 

Conveniently the nucleotide sequence will comprise substantially nucleotides 289 to 2790 
of the DNA sequence (Seq ID No. 14) shown in Figure 5 (which nucleotides encode the 
mature enzyme) or a functional equivalent thereof, and may also include further 
nucleotides at the 5' or 3' end. For example, for ease of expression, the sequence will 
desirably also comprise an in-frame ATG start codon, and may also encode a leader
sequence. Thus, in one embodiment, the sequence further comprises nucleotides 143 to
288 of the sequence shown in Figure 5. Other embodiments are nucleotides 228 to 2855
of the sequence labelled "psbe2con.seq" in Figure 8, and nucleotides 57 to 2564 of the
sequence shown in Figure 12 (preferably comprising an in-frame ATG start codon, such 
as the sequence of nucleotides 24 to 56 in the same Figure), or functional equivalents of 
the aforesaid sequences. 
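
The nucleotide and residue coordinates quoted in this specification can be cross-checked with simple arithmetic. The following Python sketch is illustrative only: the function name `codon_count` is hypothetical, and it assumes that the quoted ranges are 1-based and inclusive. It compares the range said to encode the mature enzyme (nucleotides 289 to 2790 of Figure 5) with the residue range given for the mature enzyme elsewhere in the text (residues 49 to 882).

```python
# Illustrative check only; `codon_count` is a hypothetical helper and the
# 1-based, inclusive reading of the quoted ranges is an assumption.

def codon_count(first_nt: int, last_nt: int) -> int:
    """Number of whole codons in a 1-based, inclusive nucleotide range."""
    length = last_nt - first_nt + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(
            f"range {first_nt}-{last_nt} is not a whole number of codons"
        )
    return length // 3

# Nucleotides 289 to 2790 of Figure 5 are stated to encode the mature enzyme.
mature_codons = codon_count(289, 2790)

# The mature enzyme is stated to span amino acid residues 49 to 882.
mature_residues = 882 - 49 + 1

print(mature_codons, mature_residues)  # prints "834 834"
```

Under these assumptions the two ranges agree: 2502 nucleotides correspond to 834 codons, which matches the 834 residues from position 49 to position 882.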

The term "functional equivalent" as applied herein to nucleotide sequences is intended to 
encompass those sequences which differ in their nucleotide composition to that shown in 
Figure 5 but which, by virtue of the degeneracy of the genetic code, encode polypeptides 
having identical or substantially identical amino acid sequences. It is intended that the 
term should also apply to sequences which are sufficiently homologous to the sequence of 
the invention that they can hybridise to the complement thereof under stringent
hybridisation conditions - such equivalents will preferably possess at least 85%, more
preferably at least 90%, and most preferably at least 95% sequence homology with the
sequence of the invention as exemplified by nucleotides 289 to 2790 of the DNA sequence
shown in Figure 5. It will be apparent to those skilled in the art that the nucleotide
sequence of the invention may also find useful application when present as an "antisense"
sequence. Accordingly, functionally equivalent sequences will also include those
sequences which can hybridise, under stringent hybridisation conditions, to the sequence
of the invention (rather than the complement thereof). Such "antisense" equivalents will
preferably possess at least 85%, more preferably at least 90%, and most preferably 95%
sequence homology with the complement of the sequence of the invention as exemplified
by nucleotides 289 to 2790 of the DNA sequence shown in Figure 5. Particular functional
equivalents are shown, for example, in Figures 8 and 10 (if one disregards the various 
frameshift mutations noted therein). 

The invention also provides vectors, particularly expression vectors, comprising the
nucleotide sequence of the invention. The vector will typically comprise a promoter and 
one or more regulatory signals of the type well known to those skilled in the art. The 
invention also includes provision of cells transformed (which term encompasses 
transduction and transfection) with a vector comprising the nucleotide sequence of the 
invention. 

The invention further provides a class A SBE polypeptide, obtainable from potato plants. 
In particular the invention provides the polypeptide in substantially pure form, especially 
in a form free from other plant-derived (especially potato plant-derived) components, 
which can be readily accomplished by expression of the relevant nucleotide sequence in 
a suitable non-plant host (such as any one of the yeast strains routinely used for expression 
purposes, e.g. Pichia spp. or Saccharomyces spp.). Typically the enzyme will substantially
comprise the sequence of amino acid residues 49 to 882 shown in Figure 5 (disregarding 
the sequence MNKRIDL, which is not part of the enzyme), or a functional equivalent 
thereof. The polypeptide of the invention may be used in a method of modifying starch 
in vitro, comprising treating starch under suitable conditions (e.g. appropriate temperature, 
pH, etc) with an effective amount of the polypeptide according to the invention. 

The term "functional equivalent", as applied herein to amino acid sequences, is intended
to encompass amino acid sequences substantially similar to that shown in Figure 5, such
that the polypeptide possesses sufficient activity to complement the branching enzyme
mutation in E. coli KV 832 cells (described below) and which is active in E. coli in the
phosphorylation stimulation assay. Typically such functionally equivalent amino acid
sequences will preferably possess at least 85%, more preferably at least 90%, and most
preferably at least 95% sequence identity with the amino acid sequence of the mature
enzyme (i.e. minus leader sequence) shown in Figure 5. Those skilled in the art will
appreciate that conservative substitutions may be made generally throughout the molecule
without substantially affecting the activity of the enzyme. Moreover, some
non-conservative substitutions may be tolerated, especially in the less highly conserved regions
of the molecule. Such substitutions may be made, for example, to modify slightly the
activity of the enzyme. The polypeptide may, if desired, include a leader sequence, such
as that exemplified by residues 1 to 48 of the amino acid sequence shown in Figure 5,
although other leader sequences and signal peptides and the like are known and may be
included.
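
The 85%, 90% and 95% sequence identity thresholds mentioned above can be illustrated with a short calculation. The Python below is a minimal sketch under stated assumptions, not a method taken from this specification: the two sequences are assumed to be already aligned to equal length with `-` marking gaps, identity is computed over all columns in which at least one sequence has a residue (other conventions exist), and the names `percent_identity` and `identity_tier` and the toy sequences are hypothetical. Identity alone does not establish functional equivalence, which in the text also depends on enzyme activity.

```python
# Minimal sketch; function names, the gap convention and the toy sequences
# are assumptions made for illustration.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column with no residue in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared

def identity_tier(identity: float) -> str:
    """Map an identity value onto the preference tiers quoted in the text."""
    if identity >= 95.0:
        return "at least 95%"
    if identity >= 90.0:
        return "at least 90%"
    if identity >= 85.0:
        return "at least 85%"
    return "below 85%"

# Toy 10-residue alignment differing at one position.
score = percent_identity("EKSSYNAAAA", "EKSSYNAAAT")
print(score, identity_tier(score))  # prints "90.0 at least 90%"
```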

A portion of the nucleotide sequence of the invention has been introduced into a plant and 
found to affect the characteristics of the plant. In particular, introduction of the sequence 
of the invention, operably linked in the antisense orientation to a suitable promoter, was
found to reduce the amount of branched starch molecules in the plant. Additionally, it has
recently been demonstrated in other experimental systems that "sense suppression" can
also occur (i.e. expression of an introduced sequence operably linked in the sense
orientation can interfere, by some unknown mechanism, with the expression of the native
gene), as described by Matzke & Matzke (1995 Plant Physiol. 107, 679-685). Any one
of the methods mentioned by Matzke & Matzke could, in theory, be used to affect the 
expression in a host of a homologous SBE gene. 

It is believed that antisense methods are mainly operable by the production of antisense 
mRNA which hybridises to the sense mRNA, preventing its translation into functional 
polypeptide, possibly by causing the hybrid RNA to be degraded (e.g. Sheehy et al., 1988 
PNAS 85, 8805-8809; Van der Krol et al., Mol. Gen. Genet. 220, 204-212). Sense
suppression also requires homology between the introduced sequence and the target gene,
but the exact mechanism is unclear. It is apparent however that, in relation to both
antisense and sense suppression, neither a full length nucleotide sequence, nor a "native"
sequence is essential. Preferably the "effective portion" used in the method will comprise
at least one third of the full length sequence, but by simple trial and error other fragments
(smaller or larger) may be found which are functional in altering the characteristics of the 
plant. 
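
The relationship between a sense sequence and the antisense sequence that can hybridise to it, and the "at least one third" preference, can be sketched as follows. This Python is illustrative only: the function names are hypothetical, the six-base fragment is a toy example rather than part of any sequence in the figures, and the 2502-nucleotide length is simply the size of the range 289 to 2790, used here as a stand-in for a full length sequence.

```python
# Illustrative sketch; function names and example values are assumptions.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(dna: str) -> str:
    """Return the strand that base-pairs with `dna`, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]

def meets_one_third_preference(fragment_len: int, full_len: int) -> bool:
    """True if the fragment is at least one third of the full length."""
    return 3 * fragment_len >= full_len

sense = "ATGGCC"                      # toy sense-strand fragment
antisense = reverse_complement(sense)
print(antisense)                      # prints "GGCCAT"

# Applying the function twice recovers the original strand.
print(reverse_complement(antisense) == sense)  # prints "True"

# A fragment of 834 nt is exactly one third of 2502 nt.
print(meets_one_third_preference(834, 2502))   # prints "True"
```

The one-third figure is stated in the text only as a preference; shorter or longer fragments may also be functional.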

Thus, in a further aspect the invention provides a method of altering the characteristics of
a plant, comprising introducing into the plant an effective portion of the sequence of the
invention operably linked to a suitable promoter active in the plant. Conveniently the
sequence will be linked in the antisense orientation to the promoter. Preferably the plant
is a potato plant. Conveniently, the characteristic altered relates to the starch content
and/or starch composition of the plant (i.e. amount and/or type of starch present in the
plant). Preferably the method of altering the characteristics of the plant will also comprise
the introduction of one or more further sequences, in addition to an effective portion of
the sequence of the invention. The introduced sequence of the invention and the one or
more further sequences (which may be sense or antisense sequences) may be operably
linked to a single promoter (which would ensure both sequences were transcribed at
essentially the same time), or may be operably linked to separate promoters (which may
be necessary for optimal expression). Where separate promoters are employed they may
be identical to each other or different. Suitable promoters are well known to those skilled
in the art and include both constitutive and inducible types. Examples include the CaMV
35S promoter (e.g. single or tandem repeat) and the patatin promoter. Advantageously
the promoter will be tissue-specific. Desirably the promoter will cause expression of the
operably linked sequence at substantial levels only in the tissue of the plant where starch
synthesis and/or starch storage mainly occurs. Thus, for example, where the sequence is
introduced into a potato plant, the operably linked promoter may be tuber-specific, such
as the patatin promoter.

Desirably, for example, the method will also comprise the introduction of an effective 
portion of a sequence encoding a class B SBE, operably linked in the antisense orientation 
to a suitable promoter active in the plant. Desirably the further sequence will comprise 
an effective portion of the sequence encoding the potato class B SBE molecule.
Conveniently the further sequence will comprise an effective portion of the sequence
described by Blennow & Johansson (1991 Phytochem. 30, 437-444) or that disclosed in
WO 92/11375. More preferably, the further sequence will comprise at least an effective
portion of the sequence disclosed in International Patent Application No. WO 95/26407.
Use of antisense sequences against both class A and class B SBE in combination has now
been found by the present inventors to result in the production of starch having very
greatly altered properties (see below). Those skilled in the art will appreciate the
possibility that, if the plant already comprises a sense or antisense sequence which
efficiently inhibits the class B SBE activity, introduction of a sense or antisense sequence
to inhibit class A SBE activity (thereby producing a plant with inhibition of both class A
and class B activity) might alter greatly the properties of the starch in the plant, without
the need for introduction of one or more further sequences. Thus the sequence of the
invention is conveniently introduced into plants already having low levels of class A
and/or class B SBE activity, such that the inhibition resulting from the introduction of the
sequence of the invention is likely to have a more pronounced effect.

The sequence of the invention, and the one or more further sequences if desired, can be 
introduced into the plant by any one of a number of well-known techniques (e.g. 
Agrobacterium-mediated transformation, or by "biolistic" methods). The sequences are 
likely to be most effective in inhibiting SBE activity in potato plants, but theoretically 
could be introduced into any plant. Desirable examples include pea, tomato, maize, 
wheat, rice, barley, sweet potato and cassava plants. Preferably the plant will comprise 
a natural gene encoding an SBE molecule which exhibits reasonable homology with the
introduced nucleic acid sequence of the invention. 

In another aspect, the invention provides a plant cell, or a plant or the progeny thereof, 
which has been altered by the method defined above. The progeny of the altered plant 
may be obtained, for example, by vegetative propagation, or by crossing the altered plant
and reserving the seed so obtained. The invention also provides parts of the altered plant,
such as storage organs. Conveniently, for example, the invention provides tubers
comprising altered starch, said tubers being obtained from an altered plant or the progeny
thereof. Potato tubers obtained from altered plants (or the progeny thereof) will be
particularly useful materials in certain industrial applications and for the preparation and/or
processing of foodstuffs and may be used, for example, to prepare low-fat waffles and
chips (amylose generally being used as a coating to prevent fat uptake), and to prepare
mashed potato (especially "instant" mashed potato) having particular characteristics. 

In particular relation to potato plants, the invention provides a potato plant or part thereof 
which, in its wild type possesses an effective SBE A gene, but which plant has been 
altered such that there is no effective expression of an SBE A polypeptide within the cells 
of at least part of the plant. The plant may have been altered by the method defined 
above, or may have been selected by conventional breeding to be deleted for the class A 
SBE gene, presence or absence of which can be readily determined by screening samples 
of the plants with a nucleic acid probe or antibody specific for the potato class A gene or 
gene product respectively. 

The invention also provides starch extracted from a plant altered by the method defined 
above, or the progeny of such a plant, the starch having altered properties compared to
starch extracted from equivalent, but unaltered, plants. The invention further provides a
method of making altered starch, comprising altering a plant by the method defined above
and extracting therefrom starch having altered properties compared to starch extracted
from equivalent, but unaltered, plants. Use of nucleotide sequences in accordance with 
the invention has allowed the present inventors to produce potato starches having a wide 
variety of novel properties. 

In particular the invention provides the following: a plant (especially a potato plant) altered 
by the method defined above, containing starch which, when extracted from the plant, has 
an elevated endotherm peak temperature as judged by DSC, compared to starch extracted 
from a similar, but unaltered, plant; a plant (especially a potato plant) altered by the 
method defined above, containing starch which, when extracted from the plant, has an 
elevated viscosity onset temperature (conveniently elevated by 10 - 25°C) as judged by
viscoamylograph compared to starch extracted from a similar, but unaltered, plant; a plant
(especially a potato plant) altered by the method defined above, containing starch which,
when extracted from the plant, has a decreased peak viscosity (conveniently decreased by
240 - 700 SNUs) as judged by viscoamylograph compared to starch extracted from a
similar, but unaltered, plant; a plant (especially a potato plant) altered by the method
defined above, containing starch which, when extracted from the plant, has an increased
pasting viscosity (conveniently increased by 37 - 260 SNUs) as judged by viscoamylograph
compared to starch extracted from a similar, but unaltered, plant; a plant (especially a
potato plant) altered by the method defined above, containing starch which, when extracted
from the plant, has an increased set-back viscosity (conveniently increased by 224 - 313 
SNUs) as judged by viscoamylograph compared to starch extracted from a similar, but 
unaltered, plant; a plant (especially a potato plant) altered by the method defined above, 
containing starch which, when extracted from the plant, has a decreased set-back viscosity 
as judged by viscoamylograph compared to starch extracted from a similar, but unaltered, 
plant; and a plant (especially a potato plant) altered by the method defined above, 
containing starch which, when extracted from the plant, has an elevated amylose content 
as judged by iodometric assay (i.e. by the method of Morrison & Laignelet 1983, cited 
above) compared to starch extracted from a similar, but unaltered, plant. The invention 
also provides for starch obtainable or obtained from such plants as aforesaid. 

In particular the invention provides for starch which, as extracted from a potato plant by 
wet milling at ambient temperature, has one or more of the following properties, as judged 
by viscoamylograph analysis performed according to the conditions defined below: 
viscosity onset temperature in the range 70-95°C (preferably 75-95°C); peak viscosity in
the range 500 - 12 stirring number units; pasting viscosity in the range 214 - 434 stirring
number units; set-back viscosity in the range 450 - 618 or 14 - 192 stirring number units;
or displays no significant increase in viscosity during viscoamylograph. Peak, pasting and
set-back viscosities are defined below. Viscosity onset temperature is the temperature at
which there is a sudden, marked increase in viscosity from baseline levels during
viscoamylograph, and is a term well-known to those skilled in the art.

In other particular embodiments, the invention provides starch which as extracted from a 
potato plant by wet milling at ambient temperature has a peak viscosity in the range 200 - 
500 SNUs and a set-back viscosity in the range 275-618 SNUs as judged by 
viscoamylograph according to the protocol defined below; and starch which as extracted 
from a potato plant by wet milling at ambient temperature has a viscosity which does not 
decrease between the start of the heating phase (step 2) and the start of the final holding 
phase (step 5) and has a set-back viscosity of 303 SNUs or less as judged by 
viscoamylograph according to the protocol defined below. 

For the purposes of the present invention, viscoamylograph conditions are understood to 
pertain to analysis of a 10% (w/w) aqueous suspension of starch at atmospheric pressure, 
using a Newport Scientific Rapid Visco Analyser with a heating profile of: holding at 50°C 
for 2 minutes (step 1), heating from 50 to 95°C at a rate of 1.5°C per minute (step 2), 
holding at 95°C for 15 minutes (step 3), cooling from 95 to 50°C at a rate of 1.5°C per 
minute (step 4), and then holding at 50°C for 15 minutes (step 5). Peak viscosity may be 
defined for present purposes as the maximum viscosity attained during the heating phase 
(step 2) or the holding phase (step 3) of the viscoamylograph. Pasting viscosity may be 
defined as the viscosity attained by the starch suspensions at the end of the holding phase 
(step 3) of the viscoamylograph. Set-back viscosity may be defined as the viscosity of the 
starch suspension at the end of step 5 of the viscoamylograph. 
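
The definitions above can be sketched in a few lines of Python. This is an illustrative reading of the five-step profile, not part of the described method: the function names, the representation of a trace as paired lists of times (minutes) and viscosities (SNU), and the sample trace itself are all assumptions made for the example.

```python
# Heating profile from the text: (step number, duration in minutes).
# Steps 2 and 4 each span 45 °C at 1.5 °C per minute, i.e. 30 minutes.
RAMP_MIN = (95 - 50) / 1.5
STEPS = [(1, 2.0), (2, RAMP_MIN), (3, 15.0), (4, RAMP_MIN), (5, 15.0)]


def step_bounds():
    """Map each step number to its (start, end) time in minutes."""
    bounds, t = {}, 0.0
    for step, duration in STEPS:
        bounds[step] = (t, t + duration)
        t += duration
    return bounds


def reading_at(times, viscosities, t_end):
    """Last recorded viscosity at or before t_end (times must be ascending)."""
    readings = [v for t, v in zip(times, viscosities) if t <= t_end]
    if not readings:
        raise ValueError("no reading at or before the requested time")
    return readings[-1]


def rva_metrics(times, viscosities):
    """Peak, pasting and set-back viscosity (SNU) as defined in the text."""
    b = step_bounds()
    heat_start, hold_end = b[2][0], b[3][1]
    in_window = [v for t, v in zip(times, viscosities) if heat_start <= t <= hold_end]
    return {
        "peak": max(in_window),                                 # max over steps 2-3
        "pasting": reading_at(times, viscosities, hold_end),    # end of step 3
        "set_back": reading_at(times, viscosities, b[5][1]),    # end of step 5
    }


# Invented trace for illustration only (minutes, SNU); not measured data.
times = [0, 2, 20, 32, 47, 77, 92]
trace = [10, 12, 300, 450, 380, 500, 560]
metrics = rva_metrics(times, trace)
```

Under this reading the complete profile lasts 92 minutes, with step 3 ending at 47 minutes and step 5 at 92 minutes.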

In yet another aspect the invention provides starch from a potato plant having an apparent 
amylose content (% w/w) of at least 35%, as judged by iodometric assay according to the 
method described by Morrison & Laignelet (1983 J. Cereal Science 1, 9-20). Preferably 
the starch will have an amylose content of at least 40%, more preferably at least 50%, and 
most preferably at least 66%. Starch obtained directly from a potato plant and having 
such properties has not hitherto been produced. Indeed, as a result of the present 
invention, it is now possible to generate in vivo potato starch which has some properties 
analogous to the very high amylose starches (e.g. Hylon 7) obtainable from maize. 

Starches with high (at least 35%) amylose contents find commercial application as, 
amongst other reasons, the amylose component of starch reassociates more strongly and 
rapidly than the amylopectin component during retrogradation processes. This may result, 
for example, in pastes with higher viscosities, gels of greater cohesion, or films of greater 
strength for starches with high (at least 35%) compared with normal (less than 35%) 
amylose contents. Alternatively, starches may be obtained with very high amylose 
contents, such that the granule structure is substantially preserved during heating, resulting 
in starch suspensions which demonstrate substantially no increase in viscosity during 
cooking (i.e. there is no significant viscosity increase during viscoamylograph conditions 
defined above). Such starches typically exhibit a viscosity increase of less than 10% 
(preferably less than 5%) during viscoamylograph under the conditions defined above. 

In commerce, these valuable properties are currently obtained from starches of high 
amylose content derived from maize plants. It would be of commercial value to have an 
alternative source of high amylose starches from potato as other characteristics such as 
granule size, organoleptic properties and textural qualities may distinguish application 
performances of high amylose starches from maize and potato plants. 

Thus high amylose starch obtained by the method of the present invention may find 
application in many different technological fields, which may be broadly categorised into 
two groups: food products and processing; and "Industrial" applications. Under the 
heading of food products, the novel starches of the present invention may find application 
as, for example, films, barriers, coatings or gelling agents. In general, high amylose 
content starches absorb less fat during frying than starches with low amylose content, thus 
the high amylose content starches of the invention may be advantageously used in 
preparing low fat fried products (e.g. potato chips, crisps and the like). The novel 
starches may also be employed with advantage in preparing confectionery and in granular 
and retrograded "resistant" starches. "Resistant" starch is starch which is resistant to 
digestion by α-amylase. As such, resistant starch is not digested by α-amylases present 
in the human small intestine, but passes into the colon where it exhibits properties similar 
to soluble and insoluble dietary fibre. Resistant starch is thus of great benefit in foodstuffs 
due to its low calorific value and its high dietary fibre content. Resistant starch is formed 
by the retrogradation (akin to recrystallization) of amylose from starch gels. Such 
retrogradation is inhibited by amylopectin. Accordingly, the high amylose starches of the 
present invention are excellent starting materials for the preparation of resistant starch. 
Suitable methods for the preparation of resistant starch are well-known to those skilled in 
the art and include, for example, those described in US 5,051,271 and US 5,281,276. 
Conveniently the resistant starches provided by the present invention comprise at least 5% 
total dietary fibre, as judged by the method of Prosky et al. (1985 J. Assoc. Off. Anal. 
Chem. 68, 677), mentioned in US 5,281,276. 

Under the heading of "Industrial" applications, the novel starches of the invention may be 
advantageously employed, for example, in corrugating adhesives, in biodegradable 
products such as loose fill packaging and foamed shapes, and in the production of glass 
fibers and textiles. 

Those skilled in the art will appreciate that the novel starches of the invention may, if 
desired, be subjected in vitro to conventional enzymatic, physical and/or chemical 
modification, such as cross-linking, introduction of hydrophobic groups (e.g. octenyl 
succinic acid, dodecyl succinic acid), or derivatization (e.g. by means of esterification or 
etherification). 

In yet another aspect the invention provides high (35% or more) amylose starches which 
generate paste viscosities greater than those obtained from high amylose starches from 
maize plants after processing at temperatures below 100°C. This provides the advantage 
of more economical starch gelatinisation and pasting treatments through the use of lower 
processing temperatures than are currently required for high amylose starches from maize 
plants. 

The invention will now be further described by way of illustrative example and with 
reference to the drawings, of which: 

Figure 1 shows a typical viscoamylograph for a 10% w/w suspension of potato starch; 

Figure 2 shows viscoamylographs for 10% suspensions of starch from various maize 
varieties; 

Figure 3 is a schematic representation of the cloning strategy used by the present 
inventors; 

Figure 4a shows the amino acid alignment of the C-terminal portion of starch branching 
enzyme isoforms from various sources: amino acid residues matching the consensus 

Figure 4b shows the alignment of DNA sequences of various starch branching enzyme 
isoforms which encode a conserved amino acid sequence; 

Figure 5 shows the DNA sequence (Seq ID No. 14) and predicted amino acid sequence 
(Seq ID No. 15) of a full length potato class A SBE cDNA clone obtained by PCR; 

Figure 6 shows a comparison of the most highly conserved part of the amino acid 
sequences of potato class A (uppermost sequence) and class B (lowermost sequence) SBE 
molecules; 

Figure 7 shows a comparison of the amino acid sequence of the full length potato class A 
(uppermost sequence) and pea (lowermost sequence) class A SBE molecules; 

Figure 8 shows a DNA alignment of various full length potato class A SBE clones 
obtained by the inventors; 

Figure 9 shows the DNA sequence of a potato class A SBE clone determined by direct 
sequencing of PCR products, together with the predicted amino acid sequence; 

Figure 10 is a multiple DNA alignment of various full length potato SBE A clones 
obtained by the inventors; 

Figure 11 is a schematic illustration of the plasmid pSJ64; 

Figure 12 shows the DNA sequence and predicted amino acid sequence of the full length 
potato class A SBE clone as present in the plasmid pSJ90; and 

Figure 13 shows viscoamylographs for 10% w/w suspensions of starch from various 
transgenic potato plants made by the relevant method aspect of the invention. 

Examples 

Example 1 

Cloning of Potato class A SBE 

The strategy for cloning the second form of starch branching enzyme from potato is shown 
in Figure 3. The small arrowheads represent primers used by the inventors in PCR and 
RACE protocols. The approximate size of the fragments isolated is indicated by the 
numerals on the right of the Figure. By way of explanation, a comparison of the amino 
acid sequences of several cloned plant starch branching enzymes (SBE) from maize (class 
A), pea (class A), maize (class B), rice (class B) and potato (class B), as well as human 
glycogen branching enzyme, allowed the inventors to identify a region in the 
carboxy-terminal one third of the protein which is almost completely conserved 
(GYLNFMGNEFGHPEWIDFPR) (Figure 4a). A multiple alignment of the DNA 
sequences (human, pea class A, potato class B, maize class B, maize class A and rice class 
B, respectively) corresponding to this region is shown in Figure 4b and was used to design 
an oligo which would potentially hybridize to all known plant starch branching enzymes: 
AATTT(C/T)ATGGGIAA(C/T)GA(A/G)TT(C/T)GG (Seq ID No. 20). 
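
As a check on the primer design, the degeneracy of an oligo written in this notation can be computed directly. The sketch below is illustrative only; the function names are hypothetical, and inosine (I) is treated as a single fixed position on the assumption that it serves as a universal base rather than a mixture.

```python
import re
from itertools import product


def parse_degenerate(oligo):
    """Split an oligo such as 'AA(C/T)G' into a list of per-position options."""
    positions = []
    for group, base in re.findall(r"\(([^)]+)\)|([A-Z])", oligo):
        positions.append(group.split("/") if group else [base])
    return positions


def degeneracy(oligo):
    """Number of distinct sequences represented by the degenerate oligo."""
    count = 1
    for options in parse_degenerate(oligo):
        count *= len(options)
    return count


def expand(oligo):
    """List every individual sequence in the degenerate pool."""
    return ["".join(combo) for combo in product(*parse_degenerate(oligo))]


POTSBE = "AATTT(C/T)ATGGGIAA(C/T)GA(A/G)TT(C/T)GG"
pool = expand(POTSBE)
```

Four two-fold positions give 2 x 2 x 2 x 2 = 16 sequences of 23 bases each, which agrees with the "16 fold degenerate" description of the POTSBE primer used in the library PCR.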

Library PCR 

The initial isolation of a partial potato class A SBE cDNA clone was from an amplified 
potato tuber cDNA library in the λZap vector (Stratagene). One half µL of a potato 
cDNA library (titre 2.3 x 10^[illegible] pfu/mL) was used as template in a 50 µL reaction containing 
100 pmol of a 16 fold degenerate POTSBE primer and 25 pmol of a T7 primer (present 
in the λZap vector 3' to the cDNA sequences - see Figure 3), 100 µM dNTPs, 2.5 U Taq 
polymerase and the buffer supplied with the Taq polymerase (Stratagene). All components 
except the enzyme were added to a 0.5 mL microcentrifuge tube, covered with mineral 
oil and incubated at 94°C for 7 minutes and then held at 55°C, while the Taq polymerase 
was added and mixed by pipetting. PCR was then performed by incubating for 1 min at 
94°C, 1 min at 55°C and 3 minutes at 72°C, for 35 cycles. The PCR products were 
extracted with phenol/chloroform, ethanol precipitated and resuspended in TE pH 8.0 
before cloning into the T/A cloning vector pT7BlueR (Invitrogen). 

Several fragments between 600 and 1300 bp were amplified. These were isolated from 
an agarose gel and cloned into the pT7BlueR T/A cloning vector. Restriction mapping of 
24 randomly selected clones showed that they belonged to several different groups (based 
on size and presence/absence of restriction sites). Initially four clones were chosen for 
sequencing. Of these four, two were found to correspond to the known potato class B 
SBE sequence; however, the other two, although homologous, differed significantly and 
were more similar to the pea class A SBE sequence, suggesting that they belonged to the 
class A family of branching enzymes (Burton et al., 1995 The Plant Journal, cited above). 
The latter two clones (~800 bp) were sequenced fully. They both contained at the 5' end 
the sequence corresponding to the degenerate oligonucleotide used in the PCR and had a 
predicted open reading frame of 192 amino acids. The deduced amino acid sequence was 
highly homologous to that of the pea class A SBE. 

The ~800 bp PCR derived cDNA fragment (corresponding to nucleotides 2281 to 3076 
of the psbe2con.seq sequence shown in Figure 8) was used as a probe to screen the potato 
tuber cDNA library. From one hundred and eighty thousand plaques, seven positives 
were obtained in the primary screen. PCR analysis showed that five of these clones were 
smaller than the original 800 bp cDNA clone, so these were not analysed further. The 
two other clones (designated 3.2.1 and 3.1.1) were approximately 1200 and 1500 bp in 
length respectively. These were sequenced from their 5' ends and the combined consensus 
sequence aligned with the sequence from the PCR generated clones. The cDNA clone 
3.2.1 was excised from the phage vector and plasmid DNA was prepared and the insert 
fully sequenced. Several attempts to obtain longer clones from the library were 
unsuccessful, therefore clones containing the 5' end of the full length gene were obtained 
using RACE (rapid amplification of cDNA ends). 

Rapid Amplification of cDNA ends (RACE) and PCR conditions 

RACE was performed essentially according to Frohman (1992 Amplifications 11-15). 
Two µg of total RNA from mature potato tubers was heated to 65°C for 5 min and quick 
cooled on ice. The RNA was then reverse transcribed in a 20 µL reaction for 1 hour at 
37°C using BRL's M-MLV reverse transcriptase and buffer with 1 mM DTT, 1 mM 
dNTPs, 1 U/µL RNAsin (Promega) and 500 pmol random hexamers (Pharmacia) as 
primer. Excess primers were removed on a Centricon 100 column and cDNA was 
recovered and precipitated with isopropanol. cDNA was A-tailed in a volume of 20 µL 
using 10 units terminal transferase (BRL), 200 µM dATP for 10 min at 37°C, followed 
by 5 min at 65°C. The reaction was then diluted to 0.5 ml with TE pH 8 and stored at 
4°C as the cDNA pool. cDNA clones were isolated by PCR amplification using the 
primers RoRidT17, Ro and POTSBE24. The PCR was performed in 50 µL using a hot start 
technique: 10 µL of the cDNA pool was heated to 94°C in water for 5 min with 25 pmol 
POTSBE24, 25 pmol Ro and 2.5 pmol of RoRidT17 and cooled to 75°C. Five µL of 10 
X PCR buffer (Stratagene), 200 µM dNTPs and 1.25 units of Taq polymerase were added, 
the mixture heated at 45°C for 2 min and 72°C for 40 min followed by 35 cycles of 94°C 
for 45 sec, 50°C for 25 sec, 72°C for 1.5 min and a final incubation at 72°C for 10 min. 
PCR products were separated by electrophoresis on 1% low melting agarose gels and the 
smear covering the range 600-800 bp fragments was excised and used in a second PCR 
amplification with 25 pmol of Ri and POTSBE25 primers in a 50 µL reaction (28 cycles 
of 94°C for 1 min, 50°C 1 min, 72°C 2 min). Products were purified by chloroform 
extraction and cloned into pT7 Blue. PCR was used to screen the colonies and the longest 
clones were sequenced. 

The first round of RACE only extended the length of the SBE sequence approximately 100 
bases, therefore a new A-tailed cDNA library was constructed using the class A SBE 
specific oligo POTSBE24 (10 pmol) in an attempt to recover longer RACE products. The 
first and second round PCR reactions were performed using new class A SBE primers 
(POTSBE 28 and 29 respectively) derived from the new sequence data. Conditions were 
as before except that the elongation step in the first PCR was for 3 min and the second 
PCR consisted of 28 cycles at 94°C for 45 seconds, 55°C for 25 sec and 72°C for 1 
min 45 sec. 

Clones ranging in size from 400 bp to 1.4 kb were isolated and sequenced. The combined 
sequence of the longest RACE products and cDNA clones predicted a full length gene of 
about 3150 nucleotides, excluding the poly(A) tail (psbe2con.seq in Fig. 8). 

As the sequence of the 5' half of the gene was compiled from the sequence of several 
RACE products generated using Taq polymerase, it was possible that the compiled 
sequence did not represent that of a single mRNA species and/or had nucleotide sequence 
changes. The 5' 1600 bases of the gene were therefore re-isolated by PCR using Ultma, 
a thermostable DNA polymerase which, because it possesses a 3'-5' exonuclease activity, 
has a lower error rate compared to Taq polymerase. Several PCR products were cloned 
and restriction mapped and found to differ in the number of Hind III, Ssp I, and EcoR I 
sites. These differences do not represent PCR artefacts as they were observed in clones 
obtained from independent PCR reactions (data not shown) and indicate that there are 
several forms of the class A SBE gene transcribed in potato tubers. 

In order to ensure that the sequence of the full length cDNA clone was derived from a 
single mRNA species it was therefore necessary to PCR the entire gene in one piece. 
cDNA was prepared according to the RACE protocol except that the adaptor oligo 
RoRidT17 (5 pmol) was used as a primer and after synthesis the reaction was diluted to 200 
µL with TE pH 8 and stored at 4°C. Two µL of the cDNA was used in a PCR reaction 
of 50 µL using 25 pmol of class A SBE specific primers PBER1 and PBERT (see below), 
and thirty cycles of 94°C for 1 min, 60°C for 1 min and 72°C for 3 min. If Taq 
polymerase was used the PCR products were cloned into pT7Blue whereas if Ultma 
polymerase was used the PCR products were purified by chloroform extraction, ethanol 
precipitation and kinased in a volume of 20 µL (and then cloned into pBSSKIIP which 
had been cut with EcoRV and dephosphorylated). At least four classes of cDNA were 
isolated, which again differed in the presence or absence of Hind III, Ssp I and EcoR I 
sites. Three of these clones were sequenced fully, however one clone could not be 
isolated in sufficient quantity to sequence. 

The sequence of one of the clones (number 19) is shown in Figure 5. The first methionine 
(initiation) codon starts a short open reading frame (ORF) of 7 amino acids which is out 
of frame with the next predicted ORF of 882 amino acids which has a molecular mass 
(Mr) of approximately 100 Kd. Nucleotides 6-2996 correspond to SBE sequence - the rest 
of the sequence shown is vector derived. Figure 6 shows a comparison of the most highly 
conserved part of the amino acid sequence of potato class A SBE (residues 180-871, top 
row) and potato class B SBE (bottom row, residues 98-792); the middle row indicates the 
degree of similarity, identical residues being denoted by the common letter, conservative 
changes by two dots and neutral changes by a single dot. Dashes indicate gaps introduced 
to optimise the alignment. The class A SBE protein has 44% identity over the entire 
length with potato class B SBE, and 56% identity therewith in the central conserved 
domain (Figure 6), as judged by the "Megalign" program (DNASTAR). However, Figure 
7 shows a comparison between potato class A SBE (top row, residues 1-873) and pea class 
A SBE (bottom row, residues 1-861), from which it can be observed that the cloned potato 
gene is more homologous to the class A pea enzyme, where the identity is 70% over 
nearly the entire length, and this increases to 83% over the central conserved region 
(starting at IPPP at position '110). It is clear from this analysis that this cloned potato 
SBE gene belongs to the class A family of SBE genes. 

An E. coli culture, containing the plasmid pSJ78 (which directs the expression of a full 
length potato SBE Class A gene), has been deposited (on 3rd January 1996) under the 
terms of the Budapest Treaty at The National Collections of Industrial and Marine Bacteria 
Limited (23 St Machar Drive, Aberdeen, AB2 1RY, United Kingdom), under accession 
number NCIMB 40781. Plasmid pSJ78 is equivalent to clone 19 described above. It 
represents a full length SBE A cDNA blunt-end ligated into the vector pBSSKIIP. 

Polymorphism of class A SBE genes 

Sequence analysis of the other two full length class A SBE genes showed that they contain 
frameshift mutations and are therefore unable to encode full length proteins and indeed 
they were unable to complement the branching enzyme deficiency in the KV832 mutant 
(described below). An alignment of the full length DNA sequences is shown in Figure 
8: "10con.seq" (Seq ID No. 12), "19con.seq" (Seq ID No. 14) and "11con.seq" (Seq ID 
No. 13) represent the sequence of full length clones 10, 19 and 11 obtained by PCR using 
the PBER1 and PBERT primers (see below), whilst "psbe2con.seq" (Seq ID No. 18) 
represents the consensus sequence of the RACE clones and cDNA clone 3.2.1. Those 
nucleotides which differ from the overall consensus sequence (not shown) are shaded. 
Dashes indicate gaps introduced to optimise the alignment. Apart from the frameshift 
mutations these clones are highly homologous. It should be noted that the 5' sequence of 
psbe2con is longer because this is the longest RACE product and it also contains several 
changes compared to the other clones. The upstream methionine codon is still present in 
this clone but the upstream ORF is shortened to just 3 amino acids and in addition there 
is a 10 base deletion in the 5' untranslated leader. 

The other significant area of variation is in the carboxy terminal region of the protein 
coding region. Closer examination of this area reveals a GAA trinucleotide repeat 
structure which varies in length between the four clones. These are typical characteristics 
of a microsatellite repeat region. The most divergent clone is #11 which has only one 
GAA triplet whereas clone 19 has eleven perfect repeats and the other two clones have 
five and seven GAA repeats. All of these deletions maintain the ORF but change the 
number of glutamic acid residues at the carboxy terminus of the protein. 
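
Counting the repeats in a given clone is straightforward once its sequence is in hand. The following is a minimal sketch, assuming a plain uppercase or lowercase DNA string; the function name and the toy sequences are invented for illustration and are not taken from the clones described here.

```python
import re


def longest_repeat_run(seq, unit="GAA"):
    """Number of units in the longest uninterrupted tandem run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


# Toy sequences only: eleven perfect repeats versus a single triplet.
many = "ATG" + "GAA" * 11 + "TAG"
single = "CCGAATT"
none = "CCCC"
```

Because GAA encodes glutamic acid, an in-frame run of n repeats corresponds to n consecutive glutamic acid residues, so a change in repeat number alters the protein length by whole residues without shifting the reading frame.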

Most of the other differences between the clones are single base changes. It is quite 
possible that some of these are PCR errors. To address this question direct sequencing 
of PCR fragments amplified from first strand cDNA was performed. Figure 9 shows the 
DNA sequence, and predicted amino acid sequence, obtained by such direct sequencing. 
Certain restriction sites are also marked. Nucleotides which could not be unambiguously 
assigned are indicated using standard IUPAC notation and, where this uncertainty affects 
the predicted amino acid sequence, a question mark is used. Sequence at the extreme 5' 
and 3' ends of the gene could not be determined because of the heterogeneity observed in 
the different cloned genes in these regions (see previous paragraph). However this can 
be taken as direct evidence that these differences are real and are not PCR or cloning 
artefacts. 

There is absolutely no evidence for the frameshift mutations in the PCR derived sequence 
and it would appear that these mutations are an artefact of the cloning process, resulting 
from negative selection pressure in E. coli. This is supported by the fact that it proved 
extremely difficult to clone the full length PCR products intact as many large deletions 
were seen and the full length clones obtained were all cloned in one orientation (away 
from the LacZ promoter), perhaps suggesting that expression of the gene is toxic to the 
cells. Difficulties of this nature may have been responsible, at least in part, for the 
previous failure of other researchers to obtain the present invention. 

A comparison of all the full length sequences is shown in Figure 10. In addition to clones 
10, 11 and 19 are shown the sequences of a Bgl II - Xho I product cloned directly into the 
QE32 expression vector ("86CON.SEQ", Seq ID No. 16) and the consensus sequence of 
the directly sequenced PCR products ("pcrsbe2con.seq", Seq ID No. 17). Those 
nucleotides which differ from the consensus sequence (not shown) are shaded. Dashes 
indicate gaps introduced to optimise the alignment. There are 11 nucleotide differences 
predicted to be present in the mRNA population, which are indicated by asterisks above 
and below the sequence. The other differences are probably PCR artefacts or possibly 
sequencing errors. 

Complementation of a branching enzyme deficient E. coli mutant 

To determine if the isolated SBE gene encodes an active protein, i.e. one that has 
branching enzyme activity, a complementation test was performed in the E. coli strain 
KV832. This strain is unable to make bacterial glycogen as the gene for the glycogen 
branching enzyme has been deleted (Kiel et al., 1987 Mol. Gen. Genet. 207, 294-301). 
When wild type cells are grown in the presence of glucose they synthesise glycogen (a 
highly branched glucose polymer) which stains a brown colour with iodine, whereas the 
KV832 cells make only a linear chain glucose polymer which stains blueish green with 
iodine. To determine if the cloned SBE gene could restore the ability of the KV832 cells 
to make a branched polymer, the clone pSJ90 (Seq ID No. 19) was used and constructed 
as below. The construct is a PCR-derived, substantially full length fragment (made using 
primers PBE 2B and PBE 2X, detailed below), which was cut with Bgl II and Xho I and 
cloned into the BamH I / Sal I sites of the His-tag expression vector pQE32 (Qiagen). 
This clone, pSJ86, was sequenced and found to have a frameshift mutation of two bases 
in the 5' half of the gene. This frameshift was removed by digestion with Nsi I and SnaB 
I and replaced with the corresponding fragment from a Taq-generated PCR clone to 
produce the plasmid pSJ90 (sequence shown in Figure 12; the first 10 amino acids are 
derived from the expression vector). The polypeptide encoded by pSJ90 would be 
predicted to correspond to amino acids 46-882 of the full SBE coding sequence. The 
construct pSJ90 was transformed into the branching enzyme deficient KV832 cells and 
transformants were grown on solid PYG medium (0.85% KH2PO4, 1.1% K2HPO4, 0.6% 
yeast extract) containing 1.0% glucose. To test for complementation, a loop of cells was 
scraped off and resuspended in 150 µl of water, to which was added 15 µl Lugol's solution 
(2 g KI and 1 g I2 per 300 ml water). It was found that the potato SBE fragment- 
transformed KV832 cells now stained a yellow-brown colour with iodine whereas control 
cells containing only the pQE32 vector continued to stain blue-green. 

Expression of potato class A SBE in E. coli 

Single colonies of KV832, containing one of the plasmids pQE32, pAGCR1 or pSJ90, 
were picked into 50 ml of 2xYT medium containing carbenicillin, kanamycin and 
streptomycin as appropriate (100, 50 and 25 mg/L, respectively) in a 250 ml flask and 
grown for 5 hours, with shaking, at 37°C. IPTG was then added to a final concentration 
of 1 mM to induce expression and the flasks were further incubated overnight at 25°C. 
The cells were harvested by centrifugation and resuspended in 50 mM sodium phosphate 
buffer (pH 8.0), containing 300 mM NaCl, 1 mg/ml lysozyme and 1 mM PMSF and left on 
ice for 1 hour. The cell lysates were then sonicated (3 pulses of 10 seconds at 40% power 
using a microprobe) and cleared by centrifugation at 12,000g for 10 minutes at 4°C. 
Cleared lysates were concentrated approximately 10 fold in a Centricon™ 30 filtration 
unit. Duplicate 10 µl samples of the resulting extract were assayed for SBE activity by the 
phosphorylation stimulation method, as described in International Patent Application No. 
PCT/GB95/00634. In brief, the standard assay reaction mixture (0.2 ml) was 200 mM 2- 
(N-morpholino) ethanesulphonic acid (MES) buffer pH 6.5, containing 100 nCi of 
glucose-1-phosphate at 50 mM, 0.05 mg rabbit phosphorylase A, and E. coli lysate. The 
reaction mixture was incubated for 60 minutes at [temperature illegible] and the reaction terminated and 
glucan polymer precipitated by the addition of 1 ml of 75% (v/v) methanol, 1% (w/v) 
potassium hydroxide, and then 0.1 ml glycogen (10 mg/ml). The results are presented 
below: 



Construct                       SBE Activity (cpm) 

pQE32 (control)                  1,829 
pSJ90 (potato class A SBE)      14,327 
pAGCR1 (pea class A SBE)        29,707 



The potato class A SBE activity is 7-8 fold above background levels. It was concluded 
therefore that the potato class A SBE gene was able to complement the BE mutation in the 
phosphorylation stimulation assay and that the cloned gene does indeed code for a protein 
with branching enzyme activity. 
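
The fold-over-background figure follows directly from the counts reported above. A minimal sketch of the arithmetic, with a hypothetical helper name and the vector-only control taken as background:

```python
# Counts per minute as reported in the results above.
counts = {
    "pQE32 (control)": 1829,
    "pSJ90 (potato class A SBE)": 14327,
    "pAGCR1 (pea class A SBE)": 29707,
}


def fold_over_background(counts, control):
    """Ratio of each construct's cpm to the control (background) cpm."""
    background = counts[control]
    return {name: cpm / background for name, cpm in counts.items() if name != control}


folds = fold_over_background(counts, "pQE32 (control)")
```

This gives roughly 7.8-fold for pSJ90, consistent with the 7-8 fold stated above, and roughly 16-fold for the pea class A construct pAGCR1.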

Oligonucleotides 

The following synthetic oligonucleotides (Seq ID No.s 1-11 respectively) were used: 



Oligonucleotide   Sequence 

RoRidT17          AAGGATCCGTCGACATCGATAATACGACTCACTATAGGGA(T)17 
Ro                AAGGATCCGTCGACATC 
Ri                GACATCGATAATACGAC 
POTSBE24          CATCCAACCACCATCTCGCA 
POTSBE25          TTGAGAGAAGATACCTAAGT 
POTSBE28          ATGTTCAGTCCATCTAAAGT 
POTSBE29          AGAACAACAATTCCTAGCTC 
PBER1             GGGGCCTTGAACTCAGCAAT 
PBERT             CGTCCCAGCATTCGACATAA 
PBE2B             CTTGGATCCTTGAACTCAGCAATTTG 
PBE2X             TAACTCGAGCAACGCGATCACAAGTTCGT 



Example 2 

Production of Transgenic Plants 

Construction of plant transformation vectors with antisense starch branching enzyme 
genes 

A 1200 bp Sac I - Xho I fragment, encoding approximately the -COOH half of the potato 
class A SBE (isolated from the rescued λZap clone 3.2.1), was cloned into the Sac I - Sal 
I sites of the plant transformation vector pSJ29 to create plasmid pSJ64, which is 
illustrated schematically in Figure 11. In the figure, the black line represents the DNA 
sequence. The broken line represents the bacterial plasmid backbone (containing the 
origin of replication and bacterial selection marker), which is not shown in full. The filled 
triangles on the line denote the T-DNA borders (RB = right border, LB = left border). 
Relevant restriction sites are shown above the black line, with the approximate distances 
(in kilobases) between the sites (marked by an asterisk) given by the numerals below the 
line. The thinnest arrows indicate polyadenylation signals (pAnos = nopaline synthase, 
pAg7 = Agrobacterium gene 7), the arrows intermediate in thickness denote protein 
coding regions (SBE II = potato class A SBE, HYG = hygromycin resistance gene) and 
the thickest arrows represent promoter regions (P-2x35 = double CaMV 35S promoter, 
Pnos = nopaline synthase promoter). Thus pSJ64 contained the class A SBE gene 
fragment in an antisense orientation between the 2X 35S CaMV promoter and the nopaline 
synthase polyadenylation signal. 

For information, pSJ29 is a derivative of the binary vector pGPTV-HYG (Becker et al., 
1992 Plant Molecular Biology 20, 1195-1197) modified as follows: an approximately 750 
bp (Sac I, T4 DNA polymerase blunted - Sal I) fragment of pJIT60 (Guerineau et al., 
1992 Plant Mol. Biol. 18, 815-818) containing the duplicated cauliflower mosaic virus 
(CaMV) 35S promoter (Cabb-JI strain, equivalent to nucleotides 7040 to 7376 duplicated 
upstream of 7040 to 7433, Frank et al., 1980 Cell 21, 285-294) was cloned into the Hind 
III (Klenow polymerase repaired) - Sal I sites of pGPTV-HYG to create pSJ29. 

Plant transformation 

Transformation was conducted on two types of potato plant explants; either wild type 
untransformed minitubers (in order to give single transformants containing the class A 
antisense construct alone) or minitubers from three tissue culture lines (which gave rise 
to plants #12, #15, #17 and #18 indicated in Table 1) which had already been successfully 
transformed with the class B (SBE I) antisense construct containing the tandem 35S 
promoter (so as to obtain double transformant plants, containing antisense sequences for 
both the class A and class B enzymes). 

Details of the method of Agrobacterium transformation, and of the growth of transformed 
plants, are described in International Patent Application No. WO 95/26407, except that 
the medium used contained 3% sucrose (not 1%) until the final transfer and that the initial 
incubation with Agrobacterium (strain 3850) was performed in darkness. Transformants 
containing the class A antisense sequence were selected by growth in medium containing 
15 mg/L hygromycin (the class A antisense construct comprising the HYG gene, i.e. 
hygromycin phosphotransferase). 

Transformation was confirmed in all cases by production of a DNA fragment from the 
antisense gene after PCR in the presence of appropriate primers and a crude extract of 
genomic DNA from each regenerated shoot. 

Characterisation of starch from potato plants 

Starch was extracted from plants as follows: potato tubers were homogenised in water for 
2 minutes in a Waring blender operating at high speed. The homogenate was washed and 
filtered (initially through 2 mm, then through 1 mm filters) using about 4 litres of water per 
100 g of tubers (6 extractions). Washed starch granules were finally extracted with 
acetone and air dried. 

Starch extracted from singly transformed potato plants (class A/SBE II antisense, or class 
B/SBE I antisense), or from double transformants (class A/SBE II and class B/SBE I 
antisense), or from untransformed control plants, was partially characterised. The results 
are shown in Table 1. The table shows the amount of SBE activity (units/gram tissue) in 
tubers from each transformed plant. The endotherm peak temperature (°C) of starch 
extracted from several plants was determined by DSC, and the onset temperature (°C) of 
pasting was determined by reference to a viscoamylograph ("RVA"), as described in WO 
95/26407. The viscoamylograph profile was as follows: step 1 - 50°C for 2 minutes; step 
2 - increase in temperature from 50°C to 95°C at a rate of 1.5°C per minute; step 3 - 
holding at 95°C for 15 minutes; step 4 - cooling from 95°C to 50°C at a rate of 1.5°C per 
minute; and finally, step 5 - holding at 50°C for 15 minutes. Table 1 shows the peak, 
pasting and set-back viscosities in stirring number units (SNUs), which is a measure of 
the amount of torque required to stir the suspensions. Peak viscosity may be defined for 
present purposes as the maximum viscosity attained during the heating phase (step 2) or 
the holding phase (step 3) of the viscoamylograph. Pasting viscosity may be defined as 
the viscosity attained by the starch suspensions at the end of the holding phase (step 3) of 
the viscoamylograph. Set-back viscosity may be defined as the viscosity of the starch 
suspension at the end of step 5 of the viscoamylograph. 
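
The timing of the five-step profile can be worked through with a short calculation. The following Python sketch is illustrative only and is not part of the test method: the function and variable names are hypothetical, and the ramps are assumed to be perfectly linear at the stated 1.5°C per minute. Under those assumptions the holding phase at 95°C ends 47 minutes into the test and the whole profile takes 92 minutes.

```python
# Illustrative sketch of the viscoamylograph temperature programme.
# Names are hypothetical; ramps are assumed to be exactly linear.
RAMP_RATE_C_PER_MIN = 1.5

# (label, start temperature in C, end temperature in C, hold time in min or None for a ramp)
STEPS = [
    ("step 1: hold", 50, 50, 2),
    ("step 2: heat", 50, 95, None),
    ("step 3: hold", 95, 95, 15),
    ("step 4: cool", 95, 50, None),
    ("step 5: hold", 50, 50, 15),
]


def step_duration(start_c, end_c, hold_min=None, rate=RAMP_RATE_C_PER_MIN):
    """Duration of one step in minutes (hold time, or ramp span / ramp rate)."""
    if start_c == end_c:
        return float(hold_min)
    return abs(end_c - start_c) / rate


def cumulative_end_times(steps=STEPS):
    """Map each step label to the elapsed time (min) at which that step ends."""
    elapsed = 0.0
    ends = {}
    for label, start_c, end_c, hold_min in steps:
        elapsed += step_duration(start_c, end_c, hold_min)
        ends[label] = elapsed
    return ends


def temperature_at(t_min, steps=STEPS):
    """Programmed temperature (C) at time t_min, interpolating linearly in ramps."""
    elapsed = 0.0
    for _label, start_c, end_c, hold_min in steps:
        duration = step_duration(start_c, end_c, hold_min)
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start_c + fraction * (end_c - start_c)
        elapsed += duration
    raise ValueError("time is beyond the end of the profile")


if __name__ == "__main__":
    for label, end_time in cumulative_end_times().items():
        print(f"{label}: ends at {end_time:g} min")
```

On this reckoning, pasting viscosity is read at the end of step 3 and set-back viscosity at the end of step 5.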

A determination of apparent amylose content (% w/w) was also performed, using the 
iodometric assay method of Morrison & Laignelet (1983 J. Cereal Sci. 1, 9-20). The 
results (percentage apparent amylose) are shown in Table 1. The untransformed and 
transformed control plants gave rise to starches having apparent amylose contents in the 
range 29(+/-3)%. 

Generally similar values for amylose content were obtained for starch extracted from most 
of the singly transformed plants containing the class A (SBE II) antisense sequence. 
However, some plants (#152, 249) gave rise to starch having an apparent amylose content 
of 37-38%, notably higher than the control value. Starch extracted from these plants had 
markedly elevated pasting onset temperatures, and starch from plant 152 also exhibited an 
elevated endotherm peak temperature (starch from plant 249 was not tested by DSC). 

It should be noted that, even if other single transformants were not to provide starch with 
an altered amylose/amylopectin ratio, the starch from such plants might still have different 
properties relative to starch from conventional plants (e.g. different average molecular 
weight or different amylopectin branching patterns), which might be useful. 

Double transformant plants, containing antisense sequences for both the class A and class 
B enzymes, had greatly reduced SBE activity (units/gm) compared to untransformed plants 
or single anti-sense class A transformants (as shown in Table 1). Moreover, certain of 
the double transformant plants contained starch having very significantly altered 
properties. For example, starch extracted from plants #201, 202, 208, 208a, 236 and 
236a had drastically altered amylose/amylopectin ratios, to the extent that amylose was the 
main constituent of starch from these plants. The pasting onset temperatures of starch 
from these plants were also the most greatly increased (by about 25-30°C). Starch from 
plants such as #150, 161, 212, 220 and 230a represented a range of intermediates, in that 
such starch displayed a more modest rise in both amylose content and pasting onset 
temperature. The results would tend to suggest that there is generally a correlation 
between % amylose content and pasting onset temperature, which is in agreement with the 
known behaviour of starches from other sources, notably maize. 

The marked increase in amylose content obtained by inhibition of class A SBE alone, 
compared to inhibition of class B SBE alone (see PCT/GB95/00634), might suggest that 
it would be advantageous to transform plants first with a construct to suppress class A SBE 
expression (probably, in practice, an antisense construct), select those plants giving rise 
to starch with the most altered properties, and then to re-transform with a construct to 
suppress class B SBE expression (again, in practice, probably an antisense construct), so 
as to maximise the degree of starch modification. 

In addition to pasting onset temperatures, other features of the viscoamylograph profile, 
e.g. for starches from plants #149, 150, 152, 161, 201, 236 and 236a, showed significant 
differences to starches from control plants, as illustrated in Figure 13. Referring to Figure 
13, a number of viscoamylograph traces are shown. The legend is as follows: shaded box 
- normal potato starch control (29.8% amylose content); shaded circle - starch from plant 
149 (35.6% amylose); shaded triangle, pointing upwards - plant 152 (37.5%); shaded 
triangle, pointing downwards - plant 161 (40.9%); shaded diamond - plant 150 (53.1%); 
unshaded box - plant 236a (56.7%); unshaded circle - plant 236 (60.4%); unshaded 
triangle, pointing upwards - plant 201 (66.4%); unshaded triangle, pointing downwards - 
Hylon V starch, from maize (44.9% amylose). The thin line denotes the heating profile. 

With increasing amylose content, peak viscosities during processing to 95°C decrease, and 
the drop in viscosity from the peak until the end of the holding period at 95°C also 
generally decreases (indeed, for some of the starch samples there is an increase in 
viscosity during this period). Both of these results are indicative of reduced granule 
fragmentation, and hence increased granule stability during pasting. This property has not 
previously been available in potato starch without extensive prior chemical or physical 
modification. For applications where a maximal viscosity after processing to 95°C is 
desirable (i.e. corresponding to the viscosity after 47 minutes in the viscoamylograph test), 
starch from plant #152 would be selected, as starches with both lower (Controls, #149) and 
higher (#161, #150) amylose contents have lower viscosities following this gelatinisation 
and pasting regime (Figure 13 and Table 1). It is believed that the viscosity at this stage 
is determined by a combination of the extent of granule swelling and the resistance of 
swollen granules to mechanical fragmentation. For any desired viscosity behaviour, one 
skilled in the art would select a potato starch from a range containing different amylose 
contents produced according to the invention by performing suitable standard viscosity 
tests. 

Upon cooling pastes from 95°C to 50°C, potato starches from most plants transformed 
in accordance with the invention showed an increase in viscoamylograph viscosity as 
expected for partial reassociation of amylose. Starches from plants #149, 152 and 161 all 
show viscosities at 50°C significantly in excess of those for starches from control plants 
(Figure 13 and Table 1). This contrasts with the effect of elevated amylose contents in 
starches from maize plants (Figure 2) which show very low viscosities throughout the 
viscoamylograph test. Of particular note is the fact that, for similar amylose contents, 
starch from potato plant 150 (53% amylose) shows markedly increased viscosity compared 
with Hylon V starch (44.9% amylose) as illustrated in Figure 13. This demonstrates that 
useful properties which require elevated (35% or greater) amylose levels can be obtained 
by processing starches from potato plants below 100°C, whereas more energy-intensive 
processing is required in order to generate similarly useful properties from high amylose 
starches derived from maize plants. 

Final viscosity in the viscoamylograph test (set-back viscosity after 92 minutes) is greatest 
for starch from plant #161 (40.9% amylose) amongst those tested (Figure 13 and Table 
1). Decreasing final viscosities are obtained for starches from plant #152 (37.5% 
amylose), #149 (35.6% amylose) and #150 (53.1% amylose). Set-back viscosity occurs 
where amylose molecules, exuded from the starch granule during pasting, start to re- 
associate outside the granule and form a viscous gel-like substance. It is believed that the 
set-back viscosity values of starches from transgenic potato plants represent a balance 
between the inherent amylose content of the starches and the ability of the amylose 
fraction to be exuded from the granule during pasting and therefore be available for the 
reassociation process which results in viscosity increase. For starches with low amylose 
content, increasing the amylose content tends to make more amylose available for re- 
association, thus increasing the set-back viscosity. However, above a threshold value, 
increased amylose content is thought to inhibit granule swelling, thus preventing exudation 
of amylose from the starch granule and reducing the amount of amylose available for re- 
association. This is supported by the RVA results obtained for the very high amylose 
content potato starches seen in the viscoamylograph profiles in Figure 13. For any 
desired viscosity behaviour following set-back or retrogradation to any desired temperature 
over any desired timescale, one skilled in the art would select a potato starch from a range 
containing different amylose contents produced according to the invention by performing 
standard viscosity tests. 

Further experiments with starch from plants #201 and 208 showed that this had an 
apparent amylose content of over 62% (see Table 1). Viscoamylograph studies showed 
that starch from these plants had radically altered properties and behaved in a manner 
similar to Hylon V starch from maize plants (Figure 13). Under the conditions employed 
in the viscoamylograph, this starch exhibited extremely limited (nearly undetectable) 
granule swelling. Thus, for example, unlike starch from control plants, starch from plants 
201, 208 and 208a did not display a clearly defined pasting viscosity peak during the 
heating phase. Microscopic analysis confirmed that the starch granule structure underwent 
only minor swelling during the experimental heating process. This property may well be 
particularly useful in certain applications, as will be apparent to those skilled in the art. 

Some re-grown plants have so far been found to increase still further the apparent amylose 
content of starch extracted therefrom. Such increases may be due to:- 

i) Growth and development of the first generation transformed plants may have been 
affected to some degree by the exogenous growth hormones present in the tissue culture 
system, which exogenous hormones were not present during growth of the second 
generation plants; and 

ii) Subsequent generations were grown under field conditions, which may allow for 
attainment of greater maturity than growth under laboratory conditions, it being generally 
held that amylose content of potato starch increases with maturity of the potato tuber. 

Accordingly, it should be possible to obtain potato plants giving rise to tubers with starch 
having an amylose content in excess of the 66% level so far attained, simply by analysing 
a greater number of transformed plants and/or by re-growing transgenic plants through one 
or more generations under field conditions. 

Table 1 shows that another characteristic of starch which is affected by the presence of 
anti-sense sequences to SBE is the phosphorus content. Starch from untransformed control 
plants had a phosphorus content of about 60-70 mg/100 gram dry weight (as determined 
according to the AOAC Official Methods of Analysis, 15th Edition, Method 948.09 
"Phosphorus in Flour"). Introduction into the plant of an anti-sense SBE B sequence was 
found to cause a modest increase (about two-fold) in phosphorus content, which is in 
agreement with the previous findings reported at scientific meetings. Similarly, anti-sense 
to SBE A alone causes only a small rise in phosphorus content relative to untransformed 
controls. However, use of anti-sense to both SBE A and B in combination results in up 
to a four-fold increase in phosphorus content, which is far greater than any in planta 
phosphorus content previously demonstrated for potato starch. 

This is useful in that, for certain applications, starch must be phosphorylated in vitro by 
chemical modification. The ability to obtain potato starch which, as extracted from the 
plant, already has a high phosphorus content will reduce the amount of in vitro 
phosphorylation required suitably to modify the starch. Thus, in another aspect the 
invention provides potato starch which, as extracted from the plant, has a phosphorus 
content in excess of 200 mg/100 gram dry weight starch. Typically the starch will have a 
phosphorus content in the range 200 - 240 mg/100 gram dry weight starch. 
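
The relationship between the stated phosphorus ranges and the "up to four-fold" figure can be checked with simple arithmetic. The Python sketch below is illustrative only: the names are hypothetical, and it assumes that the extremes of the two quoted ranges (60-70 and 200-240 mg/100 gram dry weight) may be compared directly, which the text does not state explicitly.

```python
# Illustrative arithmetic only; names are hypothetical and the pairing of
# range extremes is an assumption, not a statement from the specification.
CONTROL_P_MG_PER_100G = (60.0, 70.0)    # untransformed control starch
DOUBLE_P_MG_PER_100G = (200.0, 240.0)   # typical range for the high-phosphorus starch


def fold_increase_range(treated, control):
    """Return (smallest, largest) fold increase implied by two (low, high) ranges."""
    smallest = treated[0] / control[1]   # lowest treated value vs highest control
    largest = treated[1] / control[0]    # highest treated value vs lowest control
    return smallest, largest


if __name__ == "__main__":
    low, high = fold_increase_range(DOUBLE_P_MG_PER_100G, CONTROL_P_MG_PER_100G)
    print(f"fold increase between {low:.2f} and {high:.2f}")
```

Under these assumptions the largest ratio, 240/60, equals four, in line with the "up to a four-fold increase" reported for the double transformants.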

SEQUENCE LISTING 



(1) GENERAL INFORMATION: 



(i) APPLICANT: 

(A) NAME: National Starch and Chemical Investment 

Holding Corporation 

(B) STREET: 501 Silverside Road, Suite 27 

(C) CITY: Wilmington 

(D) STATE: Delaware 

(E) COUNTRY: United States of America 

(F) POSTAL CODE (ZIP): 19809 

(ii) TITLE OF INVENTION: Improvements in or Relating to Plant Starch 
Composition 

(iii) NUMBER OF SEQUENCES: 20 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO) 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
AAGGATCCGT CGACATCGAT AATACGACTC ACTATAGGGA TTTTTTTTTT TTTTTTT 57 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
AAGGATCCGT CGACATC 17 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
GACATCGATA ATACGAC 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
CATCCAACCA CCATCTCGCA 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
TTGAGAGAAG ATACCTAAGT 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
ATGTTCAGTC CATCTAAAGT 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
AGAACAACAA TTCCTAGCTC 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
GGGGCCTTGA ACTCAGCAAT 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
CGTCCCAGCA TTCGACATAA 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
CTTGGATCCT TGAACTCAGC AATTTG 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
TAACTCGAGC AACGCGATCA CAAGTTCGT 29 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3003 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



GATGGGGCCT TGAACTCAGC AATTTGACAC TCAGTTAGTT ACACTGCCAT CACTTATCAG 60 

ATCTCTATTT TTTCTCTTAA TTCCAACCAA GGAATGAATA AAAAGATAGA TTTGTAAAAA 120 

CCCTAAGGAG AGAAGAAGAA AGATGGTGTA TACACTCTCT GGAGTTCGTT TTCCTACTGT 180 

TCCATCAGTG TACAAATCTA ATGGATTCAG CAGTAATGGT GATCGGAGGA ATGCTAATAT 240 

TTCTGTATTC TTGAAAAAAC ACTCTCTTTC ACGGAAGATC TTGGCTGAAA AGTCTTCTTA 300 

CAATTCCGAA TCCCGACCTT CTACAATTGC AGCATCGGGG AAAGTCCTTG TGCCTGGAAT 360 

CCAGAGTGAT AGCTCCTCAT CCTCAACAGA TCAATTTGAG TTCGCTGAGA CATCTCCAGA 420 

AAATTCCCCA GCATCAACTG ATGTAGATAG TTCAACAATG GAACACGCTA GCCAGATTAA 480 

AACTGAGAAC GATGACGTTG AGCCGTCAAG TGATCTTACA GGAAGTGTTG AAGAGCTGGA 540 

TTTTGCTTCA TCACTACAAC TACAAGAAGG TGGTAAACTG GAGGAGTCTA AAACATTAAA 600 

TACTTCTGAA GAGACAATTA TTGATGAATC TGATAGGATC AGAGAGAGGG GCATCCCTCC 660 

ACCTGGACTT GGTCAGAAGA TTTATGAAAT AGACCCCCTT TTGACAAACT ATCGTCAACA 720 

CCTTGATTAC AGGTATTCAC AGTACAAGAA ACTGAGGGAG GCAATTGACA AGTATGAGGG 780 

TGGTTTGGAA Gci i 1 1 \rjr RTfiRTTATRA rvnOMM 1 uOU 1 TTrArTrCTA 1 1 l^MU 1 1 M 840 

TATCACTTAC CGTGAGTGGG CTCCTGGTGC CCAGTCAGCT GCCCTCATTG GGGATTTCAA 900 

CAATTGGGAC GCAAATGCTG ACTTTATGAC TCGGAATGAA TTTGGTGTCT GAGAGATTTT 960 

TCTGCCAAAT AATGTGGATG GTTCTCCTGC AATTCCTCAT GGGTCCAGAG TGAAGATACG 1020 

TATGGACACT CCATCAGGTG TTAAGGATTC CATTCCTGCT TGGATCAACT ACTCTTTACA 1080 

GCTTCCTGAT GAAATTCCAT ATAATGGAAT ATATTATGAT CCACCCGAAG AGGAGAGGTA 1140 

TATCTTCCAA CACCCACGGC CAAAGAAACC AAAGTCGGTG AGAATATATG AATCTCATAT 1200 

TGGAATGAGT AGTCCGGAGC CTAAAATTAA CTCATACGTG AATTTTAGAG ATGAAGTTCT 1260 

TCCTCGCATA AAAAAAGCTT GGGTACAATG CGGTGCAAAT TATGGCTATT CAAGAGCATT 1320 

CTTATTATGC TAGTTTTGGT TATCATGTCA CAAATTTTTT TGCACCAAGC AGCCGTTTTG 1380 

GAACGCCCGA CGACCTTAAG TCTTTGATTG ATAAAGCTCA TGAGCTAGGA ATTGTTGTTC 1440 

TCATGGACAT TGTTCACAGC CATGCATCAA ATAATACTTT AGATGGACTG AACATGTTTG 1500 

ACGGCACAGA TAGTTGTTAC TTTCACTCTG GAGCTCGTGG TTATCATTGG ATGTGGGATT 1560 

TCCGCCTCTT TAACTATGGA AACTGGGAGG TACTTAGGTA TCTTCTCTCA AATGCGAGAT 1620 

GGTGGTTGGA TGAGTTCAAA TTTGATGGAT TTAGATTTGA TGGTGTGACA TCAATGATGT 1680 

GTACTCACCA CGGATTATCG GTGGGATTCA CTGGGAACTA CGAGGAATAC TTTGGACTCG 1740 

CAACTGATGT GGATGCTGTT GTGTATCTGA TGCTGGTCAA CGATCTTATT CATGGGCTTT 1800 

TCCCAGATGC AATTACCATT GGTGAAGATG TTAGCGGAAT GCCGACATTT TGTGTTCCCG 1860 

TTCAAGATGG GGGTGTTGGC TTTGACTATC GGCTGCATAT GGCAATTGCT GATAAATGGA 1920 

TTGAGTTGCT CAAGAAACGG GATGAGGATT GGAGAGTGGG TGATATTGTT CATACACTGA 1980 

CAAATAGAAG ATGGTCGGAA AAGTGTGTTT CATACGCTGA AAGTCATGAT CAAGCTCTAG 2040 

TCGGTGATAA AACTATAGCA TTCTGGCTGA TGGACAAGGA TATGTATGAT TTTATGGCTC 2100 

TGGATAGACC GTCAACATCA TTAATAGATC GTGGGATAGC ATTACACAAG ATGATTAGGC 2160 

TTGTAACTAT GGGATTAGGA GGAGAAGGGT ACCTAAATTT CATGGGAAAT GAATTCGGCC 2220 

ACCCTGAGTG GATTGATTTC CCTAGGGCTG AACAACACCT CTCTGATGGC TCAGTAATTC 2280 

CCAGAAACCA ATTCAGTTAT GATAAATGCA GACGGAGATT TGACCTGGGA GATGCAGAAT 2340 

ATTTAAGATA CCGTGGGTTG CAAGAATTTG ACCGGGCTAT GCAGTATCTT GAAGATAAAT 2400 

ATGAGTTTAT GACTTCAGAA CACCAGTTCA TATCACGAAA GGATGAAGGA GATAGGATGA 2460 

TTGTATTTGA AAAAGGAAAC CTAGTTTTTG TCTTTAATTT TCACTGGACA AAAGGCTATT 2520 

CAGACTATCG CATAGGCTGC CTGAAGCCTG GAAAATACAA GGTTGCCTTG GACTCAGATG 2580 

ATCCACTTTT TGGTGGCTTC GGGAGAATTG ATCATAATGC CGAATATTTC ACCTTTGAAG 2640 

GATGGTATGA TGATCGTCCT CGTTCAATTA TGGTGTATGC ACCTAGTAGA ACAGCAGTGG 2700 

TCTATGCACT AGTAGACAAA GAAGAAGAAG AAGAAGAAGA AGTAGCAGTA GTAGAAGAAG 2760 

TAGTAGTAGA AGAAGAATGA ACGAACTTGT GATCGCGTTG AAAGATTTGA ACGCCACATA 2820 

GAGCTTCTTG ACGTATCTGG CAATATTGCA TTAGTCTTGG CGGAATTTCA TGTGACAACA 2880 

GGTTTGCAAT TCTTTCCACT ATTAGTAGTG CAACGATATA CGCAGAGATG AAGTGCTGAA 2940 

CAAAAACATA TGTAAAATCG ATGAATTTAT GTCGAATGCT GGGACGATCG AATTCCTGCA 3000 

GCC 3003 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2975 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

TTGATGGGCC TTGAACTCAG CAATTTGACA CTCAGTTAGT TACACTCCTA TCACTTATCA 60 

GATCTCTATT TTTTCTCTTA ATTCCAACCA GGGGAATGAA TAAAAGGATA GATTTGTAAA 120 

AACCCTAAGG AGAGAAGAAG AAAGATGGTG TATATACTCT CTGGAGTTCG TTTTCCTACT 180 

GTTCCATCAG TGTACAAATC TAATGGATTC AGCAGTAATG GTGATCGGAG GAATGCTAAT 240 

GTTTCTGTAT TCTTGAAAAA GCACTCTCTT TCACGGAAGA TCTTGGCTGA AAAGTCTTCT 300 

TACAATTCCG AATTCCGACC TTCTACAGTT GCAGCATCGG GGAAAGTCCT TGTGCCTGGA 360 

ACCCAGAGTG ATAGCTCCTC ATCCTCAACA GACCAATTTG AGTTCACTGA GACATCTCCA 420 

GAAAATTCCC CAGCATCAAC TGATGTAGAT AGTTCAACAA TGGAACACGC TAGCCAGATT 480 

AAAACTGAGA ACGATGACGT TGAGCCGTCA AGTGATCTTA CAGGAAGTGT TGAAGAGCTG 540 

GATTTTGCTT CATCACTACA ACTACAAGAA GGTGGTAAAC TGGAGGAGTC TAAAACATTA 600 

AATACTTCTG AAGAGACAAT TATTGATGAA TCTGATAGGA TCAGAGAGAG GGGCATCCCT 660 

CCACCTGGAC TTGCTCAGAA GATTTATGAA ATAGACCCCC TTTTGACAAA CTATCGTCAA 720 

CACCTTGATT ACAGGTATTC ACAGTACAAG AAACTGAGGG AGGCAATTGA CAAGTATGAG 780 

GGTGGTTTGG AAGCTTTTCT CGTGGTTATG AAAAAATGGG TTTCACTCGT AGTGCTACAG 840 

GTATCACTTA CCGTGAGTGG GCTCCTGGTG CCCAGTCAGC TGCCCTCATT GGAGATTTCA 900 

ACAATTGGGA CGCAAATGCT GACATTATGA CTCGGAATGA ATTTGGTGTC TGGGAGATTT 960 

TTCTGCCAAA TAATGTGGAT GGTTCTCCTG CAATTCCTCA TGGGTCCAGA GTGAAGATAC 1020 

GTATGGACAC TCCATCAGGT GTTAAGGATT CCATTCCTGC TTGGATCAAC TACTCTTTAC 1080 

AGCTTCCTGA TGAAATTCCA TATAATGGAA TATATTATGA TCCACCCGAA GAGGAGAGGT 1140 

ATATCTTCCA ACACCCACGG CCAAAGAAAC CAAAGTCGCT GAGAATATAT GAATCTCATA 1200 

TTGGAATGAG TAGTCCGGAG CCTAAAATTA ACTCATACGT GAATTTTAGA GATGAAGTTC 1260 

TTCCTCGCAT AAAAAAGCTT GGGTACAATG CGCTGCGAAT TATGGCTATT CAAGAGCATT 1320 

CTTATTATGC TAGTTTTGGT TATCATGTCA CAAATTTTTT TGCACCAAGC AGCCGTTTTG 1380 

GAACGCCCGA CGACCTTAAG TCTTCGATTG ATAAAGCTCA TGAGCTAGGA ATTGTTGTTC 1440 

TCATGGACAT CGTTCACAGC CATGCATCAA ATAATACTTT AGATGGACTG AACATGTTTG 1500 

ACGGCACCGA TAGTTGTTAC TTTCACTCTG GAGCTCGTGG TTATCATTGG ATGTGGGATT 1560 

CCGCCTCTTT AACTATGGAA ACTGGGAGGT ACTTAGGTAT CTTCTCTCAA ATGCGAGATG 1620 

GTGGTTGGAT GAGTTCAAAT TTGATGGATT TAGATTCGAT GGTGTGACAT CAATGATGTA 1680 

TACTCACCAC GGATTATCGG TGGGATTCAC TGGGAACTAC GAGGAATACT TTGGACTCGC 1740 

AACTGATGTG GATGCTGTTG TGTATCTGAT GCTGGTCAAC GATCTTATTC ATAGGCTTTT 1800 

CCCAGATGCA ATTACCATTG GTGAAGATGT TAGCGGAATG CCGACATTTT GTATTCCCGT 1860 

TCAAGATGGG GGTGTTGGCT TTGACTATCG GCTGCATATG GCAATTGCTG ATAAATGGAT 1920 

TGAGTTGCTC AAGAAACGGG ATGAGGATTG GAGAGTGGGT GATATTGTTC ATACACTGAC 1980 

AAATAGAAGA TGGTCGGAAA AGTGTGTTTC ATACGCTGAA AGTCATGATC AAGCTCTAGT 2040 

CGGTGATAAA ACTATAGCAT TCTGGCTGAT GGACAAGGAT ATGTATGATT TTATGGCTCT 2100 

GGATAGACCG CCAACATCAT TAATAGATCG TGGGATAGCA TTGCACAAGA TGATTAGGCT 2160 

TGTAACTATG GGATTAGGAG GAGAAGGGTA CCTAAATTTC ATGGGAAATG AATTCGGCCA 2220 

CCCTGAGTGG ATTGATTTCC CTAGGGCTGA GCCACACCTT TCTGATGGCT CAGTAATTCC 2280 

CGGAAACCAA TTCAGTTATG ATAAATGCAG ACGGAGATTT GACCTGGGAG ATGCAGAATA 2340 

TTTAAGATAC CATGGGTTAC AAGAATTTGA CTGGGCTATG CAGTATCTTG AAGATAAATA 2400 

TGAGTTTATG ACTTCAGAAC ACCAGTTCAT ATCACGAAAG GATGAAGGAG ATAGGATGAT 2460 

TGTATTTGAA AGAGGAAACC TAGTTTTCGT CTTTAATTTT CACTGGACAA ATAGCTATTC 2520 

AGACTATCGC ATAGGCTGCC TGAAGCCTGG AAAATACAAG GTTGTCTTGG ACTCAGATGA 2580 

TCCACTTTTT GGTGGCTTCG GGAGAATTGA TCATAATGCC GAATATTTCA CCTCTGAAGG 2640 

ATCGTATGAT GATCGTCCTT GTTCAATTAT GGTGTATGCA CCTAGTAGAA CAGCAGTGGT 2700 

CTATGCACTA GTAGACAAAC TAGAAGTAGC AGTAGTAGAA GAACCCATTG AAGAATGAAC 2760 

GAACTTGTGA TCGCGTTGAA AGATTTGAAC GTTACTTGGT CATCCACATA GAGCTTCTTG 2820 

ACATCAGTCT TGGCGGAATT GCATGTGACA ACAAGGTTTG CAGTTCTTTC CACTATTAGT 2880 

AGTCCACCGA TATACGCAGA GATGAAGTGC TGAACAAACA TATGTAAAAT CGATGAATTT 2940 

ATGTCGAATG CTGGGACGAT CGAATTCCTG CAGCC 2975 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3033 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 145..2790 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

TTGATGGGGC CTTGAACTCA GCAATTTGAC ACTCAGTTAG TTACACTCCT ATCACTTATC 60 

AGATCTCTAT TTTTTCTCTT AATTCCAACC AAGGAATGAA TAAAAGGATA GATTTGTAAA 120 

AACCCTAAGG AGAGAAGAAG AAAG ATG GTG TAT ACA CTC TCT GGA GTT CGT 171 
Met Val Tyr Thr Leu Ser Gly Val Arg 
1 5 

TTT CCT ACT GTT CCA TCA GTG TAC AAA TCT AAT GGA TTC AGC AGT AAT 219 
Phe Pro Thr Val Pro Ser Val Tyr Lys Ser Asn Gly Phe Ser Ser Asn 
10 15 20 25 

GGT GAT CGG AGG AAT GCT AAT GTT TCT GTA TTC TTG AAA AAG CAC TCT 267 
Gly Asp Arg Arg Asn Ala Asn Val Ser Val Phe Leu Lys Lys His Ser 
30 35 40 

CTT TCA CGG AAG ATC TTG GCT GAA AAG TCT TCT TAC AAT TCC GAA TTC 315 
Leu Ser Arg Lys Ile Leu Ala Glu Lys Ser Ser Tyr Asn Ser Glu Phe 
45 50 55 

CGA CCT TCT ACA GTT GCA GCA TCG GGG AAA GTC CTT GTG CCT GGA ACC 363 
Arg Pro Ser Thr Val Ala Ala Ser Gly Lys Val Leu Val Pro Gly Thr 
60 65 70 

CAG AGT GAT AGC TCC TCA TCC TCA ACA GAC CAA TTT GAG TTC ACT GAG 411 
Gln Ser Asp Ser Ser Ser Ser Ser Thr Asp Gln Phe Glu Phe Thr Glu 
75 80 85 

ACA TCT CCA GAA AAT TCC CCA GCA TCA ACT GAT GTA GAT AGT TCA ACA 459 
Thr Ser Pro Glu Asn Ser Pro Ala Ser Thr Asp Val Asp Ser Ser Thr 
90 95 100 105 

ATG GAA CAC GCT AGC CAG ATT AAA ACT GAG AAC GAT GAC GTT GAG CCG 507 
Met Glu His Ala Ser Gln Ile Lys Thr Glu Asn Asp Asp Val Glu Pro 
110 115 120 

TCA AGT GAT CTT ACA GGA AGT GTT GAA GAG CTG GAT TTT GCT TCA TCA 555 
Ser Ser Asp Leu Thr Gly Ser Val Glu Glu Leu Asp Phe Ala Ser Ser 
125 130 135 
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CTA CM CTA CAA GAA GGT GGT AAA CTG GAG GAG TCT AAA ACA TTA AAT 603 
Leu Gin Leu Gin Glu Gly Gly Lys Leu Glu Glu Ser Lys Thr Leu Asn 
140 145 . 150 

ACT TCT GAA GAG ACA AH ATT GAT GAA TCT GAT AGG ATC AGA GAG AGG 651 
Thr Ser Glu Glu Thr He He Asp Glu Ser Asp Arg He Arg Glu Arg 
155 160 165 

GGC ATC CCT CCA CCT GGA CTT GGT CAG AAG AH TAT GAA ATA GAC CCC 699 
Gly He Pro Pro Pro Gly Leu Gly Gin Lys He Tyr Glu He Asp Pro 
170 175 180 185 

CTT TTG ACA AAC TAT CGT CAA CAC CTT GAT TAC AGG TAT TCA CAG TAC 747 
Leu Leu Thr Asn Tyr Arg Gin His Leu Asp Tyr Arg Tyr Ser Gin Tyr 
190 195 200 

AAG AAA CTG AGG GAG GCA ATT GAC AAG TAT GAG GGT GGT TTG GAA GCC 795 
Lys Lys Leu Arg Glu Ala He Asp Lys Tyr Glu Gly Gly Leu Glu Ala 
205 210 215 

TTT TCT CGT GGT TAT GAA AAA ATG GGT TTC ACT CGT AGT GCT ACA GGT 843 
Phe Ser Arg Gly Tyr Glu Lys Met Gly Phe Thr Arg Ser Ala Thr Gly 
220 225 230 

ATC ACT TAC CGT GAG TGG GCT CTT GGT GCC CAG TCA GCT GCC CTC ATT 891 
He Thr Tyr Arg Glu Trp Ala Leu Gly Ala Gin Ser Ala Ala Leu He 
235 240 245 

GGA GAT nC AAC AAT TGG GAC GCA AAT GCT GAC ATT ATG ACT CG6 AAT 939 
Gly Asp Phe Asn Asn Trp Asp Ala Asn Ala Asp He Met Thr Arg Asn 
250 255 260 265 

GAA TTT GGT 6TC TGG GAG ATT TTT CTG CCA AAT AAT GTG GAT GGT TCT 987 
Glu Phe Gly Val Trp Glu He Phe Leu Pro Asn Asn Val Asp Gly Ser 
270 275 280 

CCT GCA ATT CCT CAT GGG TCC AGA GTG AAG ATA CGT ATG GAC ACT CCA 1035 
Pro Ala He Pro His Gly Ser Arg Val Lys He Arg Met Asp Thr Pro 
285 290 295 

TCA GGT GTT AAG GAT TCC ATT CCT GCT TGG ATC AAC TAC TCT TTA CAG 1083 
Ser Gly Val Lys Asp Ser He Pro Ala Trp He Asn Tyr Ser Leu Gin 
300 305 310 

Cn CCT GAT GAA ATT CCA TAT AAT GGA ATA CAT TAT GAT CCA CCC GAA 1131 
Leu Pro Asp Glu He Pro Tyr Asn Gly He His Tyr Asp Pro Pro Glu 

315 320 325 

GAG GAG AGG TAT ATC TTC CAA CAC CCA C6G CCA AAG AAA CCA AAG TCG 1179 
Glu Glu Arg Tyr He Phe Gin His Pro Arg Pro Lys Lys Pro Lys Ser 
330 335 340 345 

CTG AGA ATA TAT GAA TCT CAT ATT GGA ATG AGT AGT CCG GAG CCT AAA ' 1227 
Leu Arg He Tvr Glu Ser His He Gly Met Ser Ser Pro Glu Pro Lys 
350 355 360 
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ATT AAC TCA TAC GTG AAT TTT AGA GAT GAA GTT CTT CCT CGC ATA AAA 1275 
He Asn Ser Tyr Val Asn Phe Arg Asp 61 u Val Leu Pro Arg He Lys 
365 370 375 

AAG CTT GGG TAC AAT GCG CTG CAA ATT ATG GCT AH CAA GAG CAT TCT 1323 
Lys Leu Gly Tyr Asn Alj Leu Gin He Met Ala He Gin Glu His Ser 
380 385 390 

TAT TAC GCT AGT TTT GGT TAT CAT GTC ACA AAT TH TTT GCA CCA AGC 1371 
Tyr Tyr Ala Ser Phe Gly Tyr His Val Thr Asn Phe Phe Ala Pro Ser 
395 400 405 

AGC CGT TTT GGA ACG CCC GAC GAC CTT AAG TCT TTG ATT GAT AAA GCT 1419 
Ser Arg Phe Gly Thr Pro Asp Asp Leu Lys Ser Leu He Asp Lys Ala 
410 415 420 425 

CAT GAG CTA GGA ATT GTT GTT CTC ATG GAC ATT GTT CAC AGC CAT GCA 1467 
His Glu Leu Gly lie Val Val Leu Met Asd He Val His Ser His Ala 
430 435 440 

TCA AAT AAT ACT TTA GAT GGA CTG AAC ATG TTT GAC TGC ACC GAT AGT 1515 
Ser Asn Asn Thr Leu Asp Gly Leu Asn Met Phe Asp Cys Thr Asp Ser 
445 450 455 

TGT TAC TTT CAC TCT GGA GCT CGT GGT TAT CAT TGG ATG TGG GAT TCC 1563 
Cys Tyr Phe His Ser Gly Ala Arg Gly Tyr His Trp Met Trp Asp Ser 
460 465 470 

CGC CTC TTT AAC TAT GGA AAC TGG GAG GTA CTT AGG TAT CTT CTC TCA 1611 
Arg Leu Phe Asn Tyr Gly Asn Trp Glu Val Leu Arg Tyr Leu Leu Ser 
475 480 485 

AAT GCG AGA TGG TGG TTG GAT GCG TTC AAA TTT GAT GGA TTT AGA TTT 1659 
Asn Ala Arg Trp Trp Leu Asp Ala Phe Lys Phe Asp Gly Phe Arg Phe 
490 495 500 505 

GAT GGT GTG ACA TCA ATG ATG TAT ATT CAC CAC GGA TTA TCG GTG GGA 1707 
Asp Gly Val Thr Ser Met Met Tyr Ile His His Gly Leu Ser Val Gly
510 515 520

TTC ACT GGG AAC TAC GAG GAA TAC TTT GGA CTC GCA ACT GAT GTG GAT 1755
Phe Thr Gly Asn Tyr Glu Glu Tyr Phe Gly Leu Ala Thr Asp Val Asp
525 530 535

GCT GTT GTG TAT CTG ATG CTG GTC AAC GAT CTT ATT CAT GGG CTT TTC 1803
Ala Val Val Tyr Leu Met Leu Val Asn Asp Leu Ile His Gly Leu Phe
540 545 550

CCA GAT GCA ATT ACC ATT GGT GAA GAT GTT AGC GGA ATG CCG ACA TTT 1851
Pro Asp Ala Ile Thr Ile Gly Glu Asp Val Ser Gly Met Pro Thr Phe
555 560 565

TGT ATT CCC GTC CAA GAG GGG GGT GTT GGC TTT GAC TAT CGG CTG CAT 1899
Cys Ile Pro Val Gln Glu Gly Gly Val Gly Phe Asp Tyr Arg Leu His
570 575 580 585

ATG GCA ATT GCT GAT AAA CGG ATT GAG TTG CTC AAG AAA CGG GAT GAG 1947
Met Ala Ile Ala Asp Lys Arg Ile Glu Leu Leu Lys Lys Arg Asp Glu
590 595 600

GAT TGG AGA GTG GGT GAT ATT GTT CAT ACA CTG ACA AAT AGA AGA TGG 1995
Asp Trp Arg Val Gly Asp Ile Val His Thr Leu Thr Asn Arg Arg Trp
605 610 615

TCG GAA AAG TGT GTT TCA TAC GCT GAA AGT CAT GAT CAA GCT CTA GTC 2043
Ser Glu Lys Cys Val Ser Tyr Ala Glu Ser His Asp Gln Ala Leu Val
620 625 630

GGT GAT AAA ACT ATA GCA TTC TGG CTG ATG GAC AAG GAT ATG TAT GAT 2091
Gly Asp Lys Thr Ile Ala Phe Trp Leu Met Asp Lys Asp Met Tyr Asp
635 640 645

TTT ATG GCT CTG GAT AGA CCG TCA ACA TCA TTA ATA GAT CGT GGG ATA 2139
Phe Met Ala Leu Asp Arg Pro Ser Thr Ser Leu Ile Asp Arg Gly Ile
650 655 660 665

GCA TTG CAC AAG ATG ATT AGG CTT GTA ACT ATG GGA TTA GGA GGA GAA 2187
Ala Leu His Lys Met Ile Arg Leu Val Thr Met Gly Leu Gly Gly Glu
670 675 680

GGG TAC CTA AAT TTC ATG GGA AAT GAA TTC GGC CAC CCT GAG TGG ATT 2235
Gly Tyr Leu Asn Phe Met Gly Asn Glu Phe Gly His Pro Glu Trp Ile
685 690 695

GAT TTC CCT AGG GCT GAA CAA CAC CTC TCT GAT GGC TCA GTA ATC CCC 2283
Asp Phe Pro Arg Ala Glu Gln His Leu Ser Asp Gly Ser Val Ile Pro
700 705 710

GGA AAC CAA TTC AGT TAT GAT AAA TGC AGA CGG AGA TTT GAC CTG GGA 2331
Gly Asn Gln Phe Ser Tyr Asp Lys Cys Arg Arg Arg Phe Asp Leu Gly
715 720 725

GAT GCA GAA TAT TTA AGA TAC CGT GGG TTG CAA GAA TTT GAC CGG CCT 2379
Asp Ala Glu Tyr Leu Arg Tyr Arg Gly Leu Gln Glu Phe Asp Arg Pro
730 735 740 745

ATG CAG TAT CTT GAA GAT AAA TAT GAG TTT ATG ACT TCA GAA CAC CAG 2427
Met Gln Tyr Leu Glu Asp Lys Tyr Glu Phe Met Thr Ser Glu His Gln
750 755 760

TTC ATA TCA CGA AAG GAT GAA GGA GAT AGG ATG ATT GTA TTT GAA AAA 2475
Phe Ile Ser Arg Lys Asp Glu Gly Asp Arg Met Ile Val Phe Glu Lys
765 770 775

GGA AAC CTA GTT TTT GTC TTT AAT TTT CAC TGG ACA AAA AGC TAT TCA 2523
Gly Asn Leu Val Phe Val Phe Asn Phe His Trp Thr Lys Ser Tyr Ser
780 785 790

GAC TAT CGC ATA GCC TGC CTG AAG CCT GGA AAA TAC AAG GTT GCC TTG 2571
Asp Tyr Arg Ile Ala Cys Leu Lys Pro Gly Lys Tyr Lys Val Ala Leu
795 800 805

GAC TCA GAT GAT CCA CTT TTT GGT GGC TTC GGG AGA ATT GAT CAT AAT 2619
Asp Ser Asp Asp Pro Leu Phe Gly Gly Phe Gly Arg Ile Asp His Asn
810 815 820 825

GCC GAA TAT TTC ACC TTT GAA GGA TGG TAT GAT GAT CGT CCT CGT TCA 2667
Ala Glu Tyr Phe Thr Phe Glu Gly Trp Tyr Asp Asp Arg Pro Arg Ser
830 835 840

ATT ATG GTG TAT GCA CCT TGT AAA ACA GCA GTG GTC TAT GCA CTA GTA 2715
Ile Met Val Tyr Ala Pro Cys Lys Thr Ala Val Val Tyr Ala Leu Val
845 850 855

GAC AAA GAA GAA GAA GAA GAA GAA GAA GAA GAA GAA GAA GTA GCA GCA 2763
Asp Lys Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Val Ala Ala
860 865 870

GTA GAA GAA GTA GTA GTA GAA GAA GAA TGAACGAACT TGTGATCGCG 2810
Val Glu Glu Val Val Val Glu Glu Glu
875 880

TTGAAAGATT TGAACGCTAC ATAGAGCTTC TTGACGTATC TGGCAATATT GCATCAGTCT 2870

TGGCGGAATT TCATGTGACA CAAGGTTTGC AATTCTTTCC ACTATTAGTA GTGCAACGAT 2930

ATACGCAGAG ATGAAGTGCT GAACAAACAT ATGTAAAATC GATGAATTTA TGTCGAATGC 2990

TGGGACGATC GAATTCCTGC AGGCCGGGGG ACCCCTTAGT TGT 3033
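
The codon rows above pair each nucleotide triplet with a three-letter amino acid code, so a row can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code applies to this potato cDNA; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative and do not come from the specification. A codon containing a character other than A, C, G or T raises a `KeyError`, which is a convenient way to spot a triplet damaged in transcription.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter amino acid codes, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> list[str]:
    """Translate an in-frame DNA string up to the first stop codon.

    Whitespace is ignored, so a row can be pasted as printed.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]  # KeyError flags a non-ACGT codon
        if aa == "*":  # stop codon ends the open reading frame
            break
        residues.append(THREE_LETTER[aa])
    return residues


# Last coding row of SEQ ID NO: 14, including the TGA stop and 3' sequence.
last_row = "GTA GAA GAA GTA GTA GTA GAA GAA GAA TGAACGAACT TGTGATCGCG"
print(" ".join(translate(last_row)))
```

Applied to the final coding row, the sketch stops at the TGA codon after nine residues, in line with the amino acid row printed beneath it.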

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 882 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Met Val Tyr Thr Leu Ser Gly Val Arg Phe Pro Thr Val Pro Ser Val 
1 5 10 15 

Tyr Lys Ser Asn Gly Phe Ser Ser Asn Gly Asp Arg Arg Asn Ala Asn 
20 25 30 

Val Ser Val Phe Leu Lys Lys His Ser Leu Ser Arg Lys Ile Leu Ala
35 40 45

Glu Lys Ser Ser Tyr Asn Ser Glu Phe Arg Pro Ser Thr Val Ala Ala
50 55 60

Ser Gly Lys Val Leu Val Pro Gly Thr Gln Ser Asp Ser Ser Ser Ser
65 70 75 80

Ser Thr Asp Gln Phe Glu Phe Thr Glu Thr Ser Pro Glu Asn Ser Pro
85 90 95

Ala Ser Thr Asp Val Asp Ser Ser Thr Met Glu His Ala Ser Gln Ile
100 105 110

Lys Thr Glu Asn Asp Asp Val Glu Pro Ser Ser Asp Leu Thr Gly Ser
115 120 125

Val Glu Glu Leu Asp Phe Ala Ser Ser Leu Gln Leu Gln Glu Gly Gly
130 135 140

Lys Leu Glu Glu Ser Lys Thr Leu Asn Thr Ser Glu Glu Thr Ile Ile
145 150 155 160

Asp Glu Ser Asp Arg Ile Arg Glu Arg Gly Ile Pro Pro Pro Gly Leu
165 170 175

Gly Gln Lys Ile Tyr Glu Ile Asp Pro Leu Leu Thr Asn Tyr Arg Gln
180 185 190

His Leu Asp Tyr Arg Tyr Ser Gln Tyr Lys Lys Leu Arg Glu Ala Ile
195 200 205

Asp Lys Tyr Glu Gly Gly Leu Glu Ala Phe Ser Arg Gly Tyr Glu Lys
210 215 220

Met Gly Phe Thr Arg Ser Ala Thr Gly Ile Thr Tyr Arg Glu Trp Ala
225 230 235 240

Leu Gly Ala Gln Ser Ala Ala Leu Ile Gly Asp Phe Asn Asn Trp Asp
245 250 255

Ala Asn Ala Asp Ile Met Thr Arg Asn Glu Phe Gly Val Trp Glu Ile
260 265 270

Phe Leu Pro Asn Asn Val Asp Gly Ser Pro Ala Ile Pro His Gly Ser
275 280 285

Arg Val Lys Ile Arg Met Asp Thr Pro Ser Gly Val Lys Asp Ser Ile
290 295 300

Pro Ala Trp Ile Asn Tyr Ser Leu Gln Leu Pro Asp Glu Ile Pro Tyr
305 310 315 320

Asn Gly Ile His Tyr Asp Pro Pro Glu Glu Glu Arg Tyr Ile Phe Gln
325 330 335

His Pro Arg Pro Lys Lys Pro Lys Ser Leu Arg Ile Tyr Glu Ser His
340 345 350

Ile Gly Met Ser Ser Pro Glu Pro Lys Ile Asn Ser Tyr Val Asn Phe
355 360 365

Arg Asp Glu Val Leu Pro Arg Ile Lys Lys Leu Gly Tyr Asn Ala Leu
370 375 380

Gln Ile Met Ala Ile Gln Glu His Ser Tyr Tyr Ala Ser Phe Gly Tyr
385 390 395 400

His Val Thr Asn Phe Phe Ala Pro Ser Ser Arg Phe Gly Thr Pro Asp
405 410 415

Asp Leu Lys Ser Leu Ile Asp Lys Ala His Glu Leu Gly Ile Val Val
420 425 430

Leu Met Asp Ile Val His Ser His Ala Ser Asn Asn Thr Leu Asp Gly
435 440 445

Leu Asn Met Phe Asp Cys Thr Asp Ser Cys Tyr Phe His Ser Gly Ala 
450 455 460 

Arg Gly Tyr His Trp Met Trp Asp Ser Arg Leu Phe Asn Tyr Gly Asn 
465 470 475 480 

Trp Glu Val Leu Arg Tyr Leu Leu Ser Asn Ala Arg Trp Trp Leu Asp 
485 490 495 

Ala Phe Lys Phe Asp Gly Phe Arg Phe Asp Gly Val Thr Ser Met Met 
500 505 510 

Tyr Ile His His Gly Leu Ser Val Gly Phe Thr Gly Asn Tyr Glu Glu
515 520 525

Tyr Phe Gly Leu Ala Thr Asp Val Asp Ala Val Val Tyr Leu Met Leu
530 535 540

Val Asn Asp Leu Ile His Gly Leu Phe Pro Asp Ala Ile Thr Ile Gly
545 550 555 560

Glu Asp Val Ser Gly Met Pro Thr Phe Cys Ile Pro Val Gln Glu Gly
565 570 575

Gly Val Gly Phe Asp Tyr Arg Leu His Met Ala Ile Ala Asp Lys Arg
580 585 590

Ile Glu Leu Leu Lys Lys Arg Asp Glu Asp Trp Arg Val Gly Asp Ile
595 600 605

Val His Thr Leu Thr Asn Arg Arg Trp Ser Glu Lys Cys Val Ser Tyr
610 615 620

Ala Glu Ser His Asp Gln Ala Leu Val Gly Asp Lys Thr Ile Ala Phe
625 630 635 640

Trp Leu Met Asp Lys Asp Met Tyr Asp Phe Met Ala Leu Asp Arg Pro
645 650 655

Ser Thr Ser Leu Ile Asp Arg Gly Ile Ala Leu His Lys Met Ile Arg
660 665 670

Leu Val Thr Met Gly Leu Gly Gly Glu Gly Tyr Leu Asn Phe Met Gly 
675 680 685 

Asn Glu Phe Gly His Pro Glu Trp Ile Asp Phe Pro Arg Ala Glu Gln
690 695 700

His Leu Ser Asp Gly Ser Val Ile Pro Gly Asn Gln Phe Ser Tyr Asp
705 710 715 720

Lys Cys Arg Arg Arg Phe Asp Leu Gly Asp Ala Glu Tyr Leu Arg Tyr
725 730 735

Arg Gly Leu Gln Glu Phe Asp Arg Pro Met Gln Tyr Leu Glu Asp Lys
740 745 750

Tyr Glu Phe Met Thr Ser Glu His Gln Phe Ile Ser Arg Lys Asp Glu
755 760 765

Gly Asp Arg Met Ile Val Phe Glu Lys Gly Asn Leu Val Phe Val Phe
770 775 780

Asn Phe His Trp Thr Lys Ser Tyr Ser Asp Tyr Arg Ile Ala Cys Leu
785 790 795 800

Lys Pro Gly Lys Tyr Lys Val Ala Leu Asp Ser Asp Asp Pro Leu Phe
805 810 815

Gly Gly Phe Gly Arg Ile Asp His Asn Ala Glu Tyr Phe Thr Phe Glu
820 825 830

Gly Trp Tyr Asp Asp Arg Pro Arg Ser Ile Met Val Tyr Ala Pro Cys
835 840 845

Lys Thr Ala Val Val Tyr Ala Leu Val Asp Lys Glu Glu Glu Glu Glu
850 855 860

Glu Glu Glu Glu Glu Glu Val Ala Ala Val Glu Glu Val Val Val Glu
865 870 875 880

Glu Glu 



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

TCATTAAAGA GGAGAAATTA ACTATGAGAG GATCTCACCA TCACCATCAC CATGGGATCT 60 

TGGCTGAAAA GTCTTCTTAC AATTCCGAAT TCCGACCTTC TACAGTTGCA GCATCGGGGA 120

AAGTCCTTGT GCCTGGAACC CAGAGTGATA GCTCCTCATC CTCAACAAAC CAATTTGAGT 180

TCACTGAGAC ATCTCCAGAA AATTCCCCAG CATCAACTGA TGTAGATAGT TCAACAATGG 240

AACACGCTAG CCAGATTAAA ACTGAGAACG ATGACGTTGA GCCGTCAAGT GATCTTACAG 300

GAAGTGTTGA AGAGCTGGAT TTTGCTTCAT CACTACAACT ACAAGAAGGT GGTAAACTGG 360

AGGAGTCTAA AACATTAAAT ACTTCTGAAG AGACAATTAT TGATGAATCT GATAGGATCA 420

GAGAGAGGGG CATCCCTCCA CCTGGACTTG GTCAGAAGAT TTATGAAATA GACCCCCTTT 480

TGACAAACTA TCGTCAACAC CTTGATTACA GGTATTCACA GTACAAGAAA CTGAGGGAGG 540

CAATTGACAA GTATGAGGGT GGTTTGGAAG CTTTTTCTCG TGGTTATGAA AAAATGGGTT 600

TCACTCGTAG TGCTACAGGT ATCACTTACC GTGAGTGGGC TCCTGGTGCC CAGTCAGCTG 660

CCCTCATTGG AGATTTCAAC AATTGGGACG CAAATGCTGA CATTATGACT CGGAATGAAT 720

TTGGTGTCTG GGAGATTTTT CTGCCAAATA ATGTGGATGG TTCTCCTGCA ATTCCTCATG 780

GGTCCAGAGT GAAGATACGT ATGGACACTC CATCAGGTGT TAAGGATTCC ATTCCTGCTT 840

GGATCAACTA CTCTACAGCT TCCTGATGAA ATTCCATATA ATGGAATATA TTATGATCCA 900

CCCGAAGAGG AGAGGTATAT CTTCCAACAC CCACGGCCAA AGAAACCAAA GTCGCTGAGA 960

ATATATGAAT CTCATATTGG AATGAGTAGT CCGGAGCCTA AAATTAACTC ATACGTGAAT 1020

TTTAGAGATG AAGTTCTTCC TCGCATAAAA AAGCTTGGGT ACAATGCGCT GCAAATTATG 1080

GCTATTCAAG AGCATTCTTA TTATGCTAGT TTTGGTTATC ATGTCACAAA TTTTTTTGCA 1140

CCAAGCAGCC GTTTTGGAAC GCCCGACGAC CTTAAGTCTT TGATTGATAA AGCTCATGAG 1200

CTAGGAATTG TTGTTCTCAT GGACATTGTT CACAGCCATG CATCAAATAA TACTTTAGAT 1260

GGACTGAACA TGTTTGACGG CACCGATAGT TGTTACTTTC ACTCTGGAGC TCGTGGTTAT 1320

CATTGGATGT GGGATTCCCG CCTTTTTAAC TATGGAAACT GGGAGGTACT TAGGTATCTT 1380

CTCTCAAATG CGAGATGGTG GTTGGATGAG TTCAAATTTG ATGGATTTAG ATTTGATGGT 1440

GTGACATCAA TGATGTATAC TCACCACGGA TTATCGGTGG GATTCACTGG GAACTACGAG 1500

GAATACTTTG GACTCGCAAC TGATGTGGAT GCTGTTGTGT ATCTGATGCT GGTCAACGAT 1560

CTTATTCATG GGCTTTTCCC AGATGCAATT ACCATTGGTG AAGATGTTAG CGGAATGCCG 1620

ACATTTTGTA TTCCCGTTCA AGATGGGGGT GTTGGCTTTG ACTATCGGCT GCATATGGCA 1680

ATTGCTGATA AATGGATTGA GTTGCTCAAG AAACGGGATG AGGATTGGAG AGTGGGTGAT 1740

ATTGTTCATA CACTGACAAA TAGAAGATGG TCGGAAAAGT GTGTTTCATA CGCTGAAAGT 1800

CATGATCAAG CTCTAGTCGG TGATAAAACT ATAGCATTCT GGCTGATGGA CAAGGATATG 1860

TATGATTTTA TGGCTCTGGA TAGACCGCCA ACATCATTAA TAGATCGTGG GATAGCATTG 1920

CACAAGATGA TTAGGCTTGT AACTATGGGA TTAGGAGGAG AAGGGTACCT AAATTTCATG 1980

GGAAATGAAT TCGGCCACCC TGAGTGGATT GATTTCCCTA GGGCTGAACA ACACCTCTCT 2040

GATGACTCAG TAATTCCCGG AAACCAATTC AGTTATGATA AATGCAGACG GAGATTTGAC 2100

CTGGGAGATG CAGAATATTT AAGATACCGT GGGTTGCAAG AATTTGACCG GGCTATGCAG 2160

TATCTTGAAG ATAAATATGA GTTTATGACT TCAGAACACC AGTTCATATC ACGAAAGGAT 2220

GAAGGAGATA GGATGATTGT ATTTGAAAAA GGAAACCTAG TTTTTGTCTT TAATTTTCAC 2280

TGGACAAAAA GCTATTCAGA CTATCGCATA GGCTGCCTGA AGCCTGGAAA ATACAAGGTT 2340

GCCTTGGACT CAGATGATCC ACTTTTTGGT GGCTTCGGGA GAATTGATCA TAATGCCGAA 2400

TATTTCACCT TTGAAGGATG GTATGATGAT CGTCCTCGTT CAATTATGGT GTATGCACCT 2460

TGTAGAACAG CAGTGGTCTA TGCACTAGTA GACAAAGAAG AAGAAGAAGA AGAAGAAGAA 2520

GAAGAAGTAG CAGTAGTAGA AGAAGTAGTA GTAGAAGAAG AATGAACGAA CTTGTG 2576

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2529 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



GGATGCTAAT GTTTCTGTAT TCTTGAAAAA GCACTCTCTT TCACGGAAGA TCTTGGCTGA 60

AAAGTCTTCT TACAATTCCG AATCCCGACC TTCTACAGTT GCAGCATCGG GGAAAGTCCT 120

TGTGCCTGGA AYCCAGAGTG ATAGCTCCTC ATCCTCAACA GACCAATTTG AGTTCACTGA 180

GACATCTCCA GAAAATTCCC CAGCATCAAC TGATGTAGAT AGTTCAACAA TGGAACACGC 240

TAGCCAGATT AAAACTGAGA ACGATGACGT TGAGCCGTCA AGTGATCTTA CAGGAAGTGT 300

TGAAGAGCTG GATTTTGCTT CATCACTACA ACTACAAGAA GGTGGTAAAC TGGAGGAGTC 360

TAAAACATTA AATACTTCTG AAGAGACAAT TATTGATGAA TCTGATAGGA TCAGAGAGAG 420

GGGCATCCCT CCACCTGGAC TTGGTCAGAA GATTTATGAA ATAGACCCCC TTTTGACAAA 480

CTATCGTCAA CACCTTGATT ACAGGTATTC ACAGTACAAG AAACTGAGGG AGGCAATTGA 540

CAAGTATGAG GGTGGTTTGG AAGCTTTTTC TCGTGGTTAT GAAAAAATGG GTTTCACTCG 600

TAGTGCTACA GGTATCACTT ACCGTGAGTG GGCTCCTGGT GCCCAGTCAG CTGCCCTCAT 660

TGGAGATTTC AACAATTGGG ACGCAAATGC TGACATTATG ACTCGGAATG AATTTGGTGT 720

CTGGGAGATT TTTCTGCCAA ATAATGTGGA TGGTTCTCCT GCAATTCCTC ATGGGTCCAG 780




AGTGAAGATA CGYATGGACA CTCCATCAGG TGTTAAGGAT TCCATTCCTG CTTGGATCAA 840

CTACTCTTTA CAGCTTCCTG ATGAAATTCC ATATAATGGA ATATATTATG ATCCACCCGA 900

AGAGGAGAGG TATRTCTTCC AACACCCACG GCCAAAGAAA CCAAAGTCGC TGAGAATATA 960

TGAATCTCAT ATTGGAATGA GTAGTCCGGA GCCTAAAATT AACTCATACG TGAATTTTAG 1020

AGATGAAGTT CTTCCTCGCA TAAAAAASCT TGGGTACAAT GCGGTGCAAA TTATGGCTAT 1080

TCAAGAGCAT TCTTATTATG CTAGTTTTGG TTATCATGTC ACAAATTTTT TTGCACCAAG 1140

CAGCCGTTTT GGAACGCCCG ACGACCTTAA GTCTTTGATT GATAAAGCTC ATGAGCTAGG 1200

AATTGTTGTT CTCATGGACA TTGTTCACAG CCATGCATCA AATAATACTT TAGATGGACT 1260

GAACATGTTT GACGGCACAG ATAGTTGTTA CTTTCACTCT GGAGCTCGTG GTTATCATTG 1320

GATGTGGGAT TCCCGCCTCT TTAACTATGG AAACTGGGAG GTACTTAGGT ATCTTCTCTC 1380

AAATGCGAGA TGGTGGTTGG ATGAGTTCAA ATTTGATGGA TTTAGATTTG ATGGTGTGAC 1440

ATCAATGATG TATACTCACC ACGGATTATC GGTGGGATTC ACTGGGAACT ACGAGGAATA 1500

CTTTGGACTC GCAACTGATG TGGATGCTGT TGTGTATCTG ATGCTGGTCA ACGATCTTAT 1560

TCACGGGCTT TTCCCAGATG CAATTACCAT TGGTGAAGAT GTTAGCGGAA TGCCGACATT 1620

TTGTATTCCC GTTCAAGATG GGGGTGTTGG CTTTGACTAT CGGCTGCATA TGGCAATTGC 1680

TGATAAATGG ATTGAGTTGC TCAAGAAACG GGATGAGGAT TGGAGAGTGG GTGATATTGT 1740

TCATACACTG ACAAATAGAA GATGGTCGGA AAAGTGTGTT TCATMCGCTG AAAGTCATGA 1800

TCAAGCTCTA GTCGGTGATA AAACTATAGC ATYCTGGCTG ATGGACAAGG ATATGTATGA 1860

TTTTATGGCT CTGGATAGAC CGYCAACAYC ATTAATAGAT CGTGGGATAG CATTGCACAA 1920

GATGATTAGG CTTGTAACTA TGGGATTAGG AGGAGAAGGG TACCTAAATT TCATGGGAAA 1980

TGAATTCGGC CACCCTGAGT GGATTGATTT CCCTAGGGCT GARCAACACC TCTCTGATGG 2040

CTCAGTAATT CCCGGAAACC AATTCAGTTA TGATAAATGG AGACGGAGAT TTGACCTGGG 2100

AGATGCAGAA TATTTAAGAT ACCATGGGTT GCAAGAATTT GACCGGGCTA TGCAGTATCT 2160

TGAAGATAAA TATGAGTTTA TGACTTCAGA ACACCAGTTC ATATCACGAA AGGATGAAGG 2220

AGATAGGATG ATTGTATTTG AAARAGGAAA CCTAGTTTTT GTCTTTAATT TTCACTGGAC 2280

AAATAGCTAT TCAGACTATC GCATAGGCTG CCTGAAGCCT GGAAAATACA AGGTTGGCTT 2340

GGACTCAGAT GATCCACTTT TTGGTGGCTT CGGGAGAATT GATCATAATG CCGAATATTT 2400

CACCTCTGAA GGATCGTATG ATGATCGTCC TCGTTCAATT ATGGTGTATG CACCTAGTAG 2460

AACAGCAGTG GTCTATGCAC TAGTAGACAA ANTAGAAGNA GAAGAAGAAG AAGAANCCGN 2520

NGAAGAATT 2529

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3231 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 



GATTTAATAC GACTCACTAT AGGGATTTTT TTTTTTTTTT TTTTAAAAAC CTCCTCCACT 60

CAGTCTTGGG ATCTCTCTCT CTCTTCACGC TTCTCTTGGG GCCTTGAACT CAGCAATTTG 120

ACACTCAGTT AGTTACACTC CTATCACTCA TCAGATCTCT Aim lie IC TTAATTCCAA 180

CCAAGGAATG AATTAAAAGA TTAGATTTGA AGGAGAGAAG AAGAAAGATG GTGTATACAC 240

TCTCTGGAGT TCGTTTTCCT ACTGTTCCAT CAGTGTACAA ATCTAATGGA TTCAGGAGTA 300

ATGGTGATCG GAGGAATGCT AATGTTTCTG TATTCTTGAA AAAGCACTCT CTTTCACGGA 360

AGATCTTGGC TGAAAAGTCT TCTTACGATT CCGAATCCCG ACCTTCTACA GTTGCAGCAT 420

CGGGGAAAGT CCTTGTACCT GGAATCCAGA GTGATAGCTC CTCATCCTCA ACAGACCAAT 480

TTGAGTTCAC TGAGACAGCT CCAGAAAATT CCCCAGCATC AACTGATGTG GATAGTTCAA 540

CAATGGAACA CGCTAGCCAG ATTAAAACTG AGAACGATGA CGTTGAGCCG TCAAGTGATC 600

TTACAGGAAG TGTTGAAGAG TTGGATTTTG CTTCATCACT ACAACTACAA GAAGGTGGTA 660

AACTGGAGGA GTCTAAAACA TTAAATACTT CTGAAGAGAC AATTATTGAT GAATCTGATA 720

GGATCAGAGA GAGGGGCATC CCTCCACCTG GACTTGGTCA GAAGATTTAT GAAATAGACC 780

CCCTTTTGAC AAACTATCGT CAACACCTTG ATTACAGGTA TTCACAGTAC AAGAAAATGA 840

GGGAGGCAAT TGACAAGTAT GAGGGTGGTT TGGAAGCTTT TTCTCGTGGT TATGAAAAAA 900

TGGGTTTCAC TCGTAGTGCT ACAGGTATCA CTTACCGTGA GTGGGCTCCT GGTGCCCAGT 960

CAGCTGCTCT CATTGGAGAT TTCAACAATT GGGACGCAAA TGCTGACATT ATGACTCGGA 1020

ATGAATTTGG TGTCTGGGAG ATTTTTCTGC CAAATAATGT GGATGGTTCT CCTGCAATTC 1080

CTCATGGGTC CAGAGTGAAG ATACGCATGG ACACTTCATC AGGTCTTAAG GATTCCATTC 1140

CTGCTTGGAT CAACTACTCT TTACAGCTTC CTGATGAAAT TCCATATAAT GGAATATATT 1200

ATGATCCACC CGAAGAGGAG AGGTATGTCT TCCAACACCC ACGGCCAAAG AAACCAAAGT 1260

CGCTGAGAAT ATATGAATCT CATATTGGAA TGAGTAGTCC GGAGCCTAAA ATTAACTCAT 1320

ACGTGAATTT TAGAGATGAA GTTCTTCCTC GCATAAAAAA CCTTGGGTAC AATGCGGTGC 1380

AAATTATGGC TATTCAAGAG CATTCTTATT ATGCTAGTTT TGGTTATCAT GTCACAAATT 1440

TTTTTGCACC AAGCAGCCGT TTTGGAACGC CCGACGACCT TAAGTCTTTG ATTGATAAAG 1500

CTCATGAGCT AGGAATTGTT GTTCTCATGG ACATTGTTCA CAGCCATGCA TCAAATAATA 1560

CTTTAGATGG ACTGAACATG TTTGACGGCA CAGATAGTTG TTACTTTCAC TCTGGAGCTC 1620

GTGGTTATCA TTGGATGTGG GATTCCCGCC TCTTTAACTA TGGAAACTGG GAGGTACTTA 1680

GGTATCTTCT CTCAAATGCG AGATGGTGGT TGGATGAGTG CAAATTTGRT GGATTTAGAT 1740

TTGATGGTGT GACATCAATG ATGTATACTC ACCACGGATT ATCGGTGGGA TTCACTGGGA 1800

ACTACGAGGA ATACTTTGGA CTCGCAACTG ATGTRGATGC TGCCGTGTAT CTGATGCTGG 1860

CCAACGATCT TATTCATGGG CTTTTCCCAG ATGCAATTAC CATTGGTGAA GATGTTAGCG 1920

GAATGCCGAC ATTTTGTATT CCCGTTCAAG ATGGGGGTGT TGGCTTTGAC TATCGGCTGC 1980

ATATGGCAAT TGCTGATAAA TGGATTGAGT TGCTCAAGAA ACGGGATGAG GATTGGAGAG 2040

TGGGTGATAT TGTTCATACA CTGACAAATA GAAGATGGTC GGAAAAGTGT GTTTCATACG 2100

CTGAAAGTCA TGATCAAGCT CTAGTCGGTG ATAAAACTAT AGCATTCTGG CTGATGGACA 2160

AGGATATGTA TGATTTTATG GCTTTGGATA GACCGTCAAC ATCATTAATA GATCGTGGGA 2220

TAGCATTGCA CAAGATGATT AGGCTTGTAA CTATGGGATT AGGAGGAGAA GGGTACCTAA 2280

ATTTCATGGG AAATGAATTC GGCCACCCTG AGTGGATTGA TTTCCCTAGG GCTGAACAAC 2340

ACCTCTCTGA TGGCTCAGTA ATTCCCGGAA ACCAATTCAG TTATGATAAA TGCAGACGGA 2400

GATTTGACCT GGGAGATGCA GAATATTTAA GATACCGTGG GTTGCAAGAA TTTGACCGGG 2460

CTATGCAGTA TCTTGAAGAT AAATATGAGT TTATGACTTC AGAACACCAG TTCATATCAC 2520

GAAAGGATGA AGGAGATAGG ATGATTGTAT TTGAAAAAGG AAACCTAGTT TTTGTCTTTA 2580

ATTTTCACTG GACAAAAAGC TATTCAGACT ATCGCATAGG CTGGCTGAAG CCTGGAAAAT 2640

ACAAGGTTGC CTTGGACTCA GATGATCCAC TTTTTGGTGG CTTCGGGAGA ATTGATCATA 2700

ATGCCGAATG TTTCACCTTT GAAGGATGGT ATGATGATCG TCCTCGTTCA ATTATGGTGT 2760

ATGCACCTAG TAGAACAGCA GTGGTCTATG CACTAGTAGA CAAAGAAGAA GAAGAAGAAG 2820

AAGTAGCAGT AGTAGAAGAA GTAGTAGTAG AAGAAGAATG AACGAACTTG TGATCGCGTT 2880

GAAAGATTTG AACGCTACAT AGAGCTTCTT GACGTATCTG GCAATATTGC ATCAGTCTTG 2940

GCGGAATTTC ATGTGACAAA AGGTTTGCAA TTCTTTCCAC TATTAGTAGT GCAACGATAT 3000

ACGCAGAGAT GAAGTGCTGA ACAAACATAT GTAAAATCGA TGAATTTATG TCGAATGCTG 3060

GGACGGGCTT CAGCAGGTTT TGCTTAGTGA GTTCTGTAAA TTGTCATCTC TTTANATGTA 3120

GAGCCCACTA GAAATCAATT ATGTGAGACC TAAAAAACAA TAACCATAAA ATGGAAATAG 3180

TGCTGATCTA ATGATGTTTT AANCCNNNNA AAAAAAAAAA AAAAACTCGA G 3231

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2578 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 



TCATTAAAGA GGAGAAATTA ACTATGAGAG GATCTCACCA TCACCATCAC CATGGGATCT 60

TGGCTGAAAA GTCTTCTTAC AATTCCGAAT TCCGACCTTC TACAGTTGCA GCATCGGGGA 120

AAGTCCTTGT GCCTGGAACC CAGAGTGATA GCTCCTCATC CTCAACAAAC CAATTTGAGT 180

TCACTGAGAC ATCTCCAGAA AATTCCCCAG CATCAACTGA TGTAGATAGT TCAACAATGG 240

AACACGCTAG CCAGATTAAA ACTGAGAACG ATGACGTTGA GCCGTCAAGT GATCTTACAG 300

GAAGTGTTGA AGAGCTGGAT TTTGCTTCAT CACTACAACT ACAAGAAGGT GGTAAACTGG 360

AGGAGTCTAA AACATTAAAT ACTTCTGAAG AGACAATTAT TGATGAATCT GATAGGATCA 420

GAGAGAGGGG CATCCCTCCA CCTGGACTTG GTCAGAAGAT TTATGAAATA GACCCCCTTT 480

TGACAAACTA TCGTCAACAC CTTGATTACA GGTATTCACA GTACAAGAAA CTGAGGGAGG 540

CAATTGACAA GTATGAGGGT GGTTTGGAAG CTTTTTCTCG TGGTTATGAA AAAATGGGTT 600

TCACTCGTAG TGCTACAGGT ATCACTTACC GTGAGTGGGC TCCTGGTGCC CAGTCAGCTG 660

CCCTCATTGG AGATTTCAAC AATTGGGACG CAAATGCTGA CATTATGACT CGGAATGAAT 720

TTGGTGTCTG GGAGATTTTT CTGCCAAATA ATGTGGATGG TTCTCCTGCA ATTCCTCATG 780

GGTCCAGAGT GAAGATACGT ATGGACACTC CATCAGGTGT TAAGGATTCC ATTCCTGCTT 840

GGATCAACTA CTCTTCACAG CTTCCTGATG AAATTCCATA TAATGGAATA TATTATGATC 900

CACCCGAAGA GGAGAGGTAT ATCTTCCAAC ACCCACGGCC AAAGAAACCA AAGTCGCTGA 960

GAATATATGA ATCTCATATT GGAATGAGTA GTCCGGAGCC TAAAATTAAC TCATACGTGA 1020

ATTTTAGAGA TGAAGTTCTT CCTCGCATAA AAAAGCTTGG GTACAATGCG GTGCAAATTA 1080

TGGCTATTCA AGAGCATTCT TATTATGCTA GTTTTGGTTA TCATGTCACA AATTTTTTTG 1140

CACCAAGCAG CCGTTTTGGA ACGCCCGACG ACCTTAAGTC TTTGATTGAT AAAGCTCATG 1200

AGCTAGGAAT TGTTGTTCTC ATGGACATTG TTCACAGCCA TGCATCAAAT AATACTTTAG 1260

ATGGACTGAA CATGTTTGAC GGCACCGATA GTTGTTACTT TCACTCTGGA GCTCGTGGTT 1320

ATCATTGGAT GTGGGATTCC CGCCTTTTTA ACTATGGAAA CTGGGAGGTA CTTAGGTATC 1380

TTCTCTCAAA TGCGAGATGG TGGTTGGATG AGTTCAAATT TGATGGATTT AGATTTGATG 1440

GTGTGACATC AATGATGTAT ACTCACCACG GATTATCGGT GGGATTCACT GGGAACTACG 1500

AGGAATACTT TGGACTCGCA ACTGATGTGG ATGCTGTTGT GTATCTGATG CTGGTCAACG 1560

ATCTTATTCA TGGGCTTTTC CCAGATGCAA TTACCATTGG TGAAGATGTT AGCGGAATGC 1620

CGACATTTTG TATTCCCGTT CAAGATGGGG GTGTTGGCTT TGACTATCGG CTGCATATGG 1680

CAATTGCTGA TAAATGGATT GAGTTGCTCA AGAAACGGGA TGAGGATTGG AGAGTGGGTG 1740

ATATTGTTCA TACACTGACA AATAGAAGAT GGTCGGAAAA GTGTGTTTCA TACGCTGAAA 1800

GTCATGATCA AGCTCTAGTC GGTGATAAAA CTATAGCATT CTGGCTGATG GACAAGGATA 1860

TGTATGATTT TATGGCTCTG GATAGACCGC CAACATCATT AATAGATCGT GGGATAGCAT 1920

TGCACAAGAT GATTAGGCTT GTAACTATGG GATTAGGAGG AGAAGGGTAC CTAAATTTCA 1980

TGGGAAATGA ATTCGGCCAC CCTGAGTGGA TTGATTTCCC TAGGGCTGAA CAACACCTCT 2040

CTGATGACTC AGTAATTCCC GGAAACCAAT TCAGTTATGA TAAATGCAGA CGGAGATTTG 2100

ACCTGGGAGA TGCAGAATAT TTAAGATACC GTGGGTTGCA AGAATTTGAC CGGGCTATGC 2160

AGTATCTTGA AGATAAATAT GAGTTTATGA CTTCAGAACA CCAGTTCATA TCACGAAAGG 2220

ATGAAGGAGA TAGGATGATT GTATTTGAAA AAGGAAACCT AGTTTTTGTC TTTAATTTTC 2280

ACTGGACAAA AAGCTATTCA GACTATCGCA TAGGCTGCCT GAAGCCTGGA AAATACAAGG 2340

TTGCCTTGGA CTCAGATGAT CCACTTTTTG GTGGCTTCGG GAGAATTGAT CATAATGCCG 2400

AATATTTCAC CTTTGAAGGA TGGTATGATG ATCGTCCTCG TTCAATTATG GTGTATGCAC 2460

CTTGTAGAAC AGCAGTGGTC TATGCACTAG TAGACAAAGA AGAAGAAGAA GAAGAAGAAG 2520

AAGAAGAAGT AGCAGTAGTA GAAGAAGTAG TAGTAGAAGA AGAATGAACG AACTTCTG 2578

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
AATTTYATGG GNAAYGARTT YGG 
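
SEQ ID NO: 20 is written with IUPAC ambiguity codes (Y, N, R), so it stands for a pool of oligonucleotides rather than a single sequence. Read in frame from its first base, it codes for Asn Phe Met Gly Asn Glu Phe Gly, a motif that also appears in the amino acid rows of SEQ ID NO: 14. The following is a minimal sketch of how the pool could be enumerated; the names `IUPAC`, `degeneracy` and `expand` are illustrative and are not taken from the specification.

```python
from itertools import product
from math import prod

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer)


def expand(primer: str) -> list[str]:
    """List every concrete sequence in the degenerate pool."""
    return ["".join(p) for p in product(*(IUPAC[base] for base in primer))]


# SEQ ID NO: 20 with the spacing of the listing removed.
primer = "AATTTYATGGGNAAYGARTTYGG"
print(len(primer), degeneracy(primer))

# The matching stretch of SEQ ID NO: 14 (codons for Asn Phe Met Gly Asn Glu Phe Gly).
print("AATTTCATGGGAAATGAATTCGG" in expand(primer))
```

With three Y positions, one N and one R, the pool contains 2 x 2 x 2 x 4 x 2 = 64 sequences of 23 bases each.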
CLAIMS

1. Starch extracted from a potato plant and having an amylose content of at least 35%,
as judged by the iodometric assay method of Morrison & Laignelet (1983 J. Cereal
Science 1, 9-20).

2. Starch according to claim 1, having an amylose content of at least 37%, as judged by 
the method defined in claim 1. 

3. Starch according to claim 1, having an amylose content of at least 40%, as judged by 
the method defined in claim 1. 

4. Starch according to claim 1, having an amylose content of at least 50%, as judged by 
the method defined in claim 1. 

5. Starch according to claim 1, having an amylose content of at least 66%, as judged by
the method defined in claim 1.

6. Starch according to any one of claims 1-5, having an amylose content of 35 - 66%,
as judged by the method defined in claim 1.

7. Starch which as extracted from a potato plant by wet milling at ambient temperature
has a viscosity onset temperature in the range 70 - 95°C, as judged by viscoamylograph
of a 10% w/w aqueous suspension thereof, performed at atmospheric pressure using the
Newport Scientific Rapid Visco Analyser 3C with a heating profile of holding at 50°C for
2 minutes, heating from 50 to 95°C at a rate of 1.5°C per minute, holding at 95°C for 15
minutes, cooling from 95 to 50°C at a rate of 1.5°C per minute, and then holding at 50°C
for 15 minutes.

8. Starch which as extracted from a potato plant by wet milling at ambient temperature 
has peak viscosity in the range 500 - 12 stirring number units (SNUs), as judged by 
viscoamylograph conducted according to the protocol defined in claim 7.

9. Starch which as extracted from a potato plant by wet milling at ambient temperature
has a pasting viscosity in the range 214 - 434 SNUs, as judged by viscoamylograph 
conducted according to the protocol defined in claim 7. 

10. Starch which as extracted from a potato plant by wet milling at ambient temperature
has a set-back viscosity in the range 450 - 618 SNUs, as judged by viscoamylograph
conducted according to the protocol defined in claim 7. 

11. Starch which as extracted from a potato plant by wet milling at ambient temperature 
has a set-back viscosity in the range 14 - 192 SNUs, as judged by viscoamylograph 
conducted according to the protocol defined in claim 7. 

12. Starch which as extracted from a potato plant by wet milling at ambient temperature 
has a peak viscosity in the range 200 - 500 SNUs and a set-back viscosity in the range 
275-618 SNUs as judged by viscoamylograph according to the protocol defined in claim 
7. 



13. Starch which as extracted from a potato plant by wet milling at ambient temperature 
has a viscosity which does not decrease between the start of the heating phase (step 2) and 
the start of the final holding phase (step 5) and has a set-back viscosity of 303 SNUs or 
less as judged by viscoamylograph according to the protocol defined in claim 7.

14. Starch which as extracted from a potato plant by wet milling at ambient temperature 
displays no significant increase in viscosity as judged by viscoamylograph conducted 
according to the protocol defined in claim 7. 

15. Starch which as extracted from a potato plant by wet milling at ambient temperature,
is in accordance with claim 7 and in accordance with any one of claims 8 to 14.

16. Starch according to any one of claims 7 to 15, having an amylose content in the range 
35 - 66%, as judged by the method of Morrison & Laignelet defined in claim 1.

17. Starch which as extracted from a potato plant, has a phosphorus content in excess of
200mg/100grams dry weight starch. 

18. Starch according to claim 17, having a phosphorus content in the range 200 - 
240mg/100grams dry weight starch. 

19. Starch according to claim 17 or 18, further in accordance with any one of claims 1 
to 16. 

20. Starch prepared by physical, chemical and/or enzymatic treatment of a starch initially 
having properties in accordance with any one of claims 1-19. 

21. Starch according to claim 20, being resistant starch prepared by physical, chemical 
and/or enzymatic treatment of a starch initially having properties in accordance with any 
one of claims 1-19. 

22. Starch according to claim 21, comprising in excess of 5% total dietary fibre, as
determined according to the method of Prosky et al. (1985 J. Assoc. Off. Anal. Chem.
68, 677).

23. Use of starch according to any one of claims 1-22 in the preparation or processing 
of a foodstuff. 

24. Use of starch according to claim 23, wherein the starch is used to provide a film,
barrier, coating or as a gelling agent. 

25. Use of starch according to claim 23, to prepare resistant starch compositions. 

26. Use of starch according to any one of claims 1-22 in the preparation or processing 
of corrugating adhesives, biodegradable products, packaging, glass fibers and textiles. 

27. A nucleotide sequence encoding an effective portion of a class A starch branching
enzyme (SBE) obtainable from potato plants.


28. A nucleotide sequence according to claim 27, encoding a polypeptide comprising 
substantially the amino acid sequence of residues 49 to 882 of the sequence shown in 
Figure 5.

29. A nucleotide sequence according to claim 27 or 28, comprising substantially the 
sequence of nucleotides 289 to 2790 of the sequence shown in Figure 5, or a functional 
equivalent thereof. 

30. A nucleotide sequence according to claim 29, further comprising the sequence of 
nucleotides 145 to 288 of the sequence shown in Figure 5, or a functional equivalent 
thereof. 

31. A nucleotide sequence according to claim 27, comprising the sequence of nucleotides 
228 to 2855 of the sequence labelled psbe2con.seq in Figure 8, or a functional equivalent 
thereof. 

32. A nucleotide sequence according to claim 27, comprising the sequence of nucleotides 
57 to 2564 of the sequence labelled as psbe2con.seq in Figure 12, or a functional 
equivalent thereof. 

33. A nucleotide sequence according to any one of claims 27 to 32, comprising an in- 
frame ATG start codon, and optionally including a 5' and/or a 3' untranslated region. 

34. A nucleotide sequence according to claim 27, comprising the sequence of nucleotides
45 to 3200 of the sequence labelled as psbe2con.seq in Figure 8, or a functional equivalent 
thereof. 

35. A nucleic acid construct comprising a sequence in accordance with any one of claims 
27 to 34.

36. An expression vector comprising a nucleic acid construct according to claim 35.



37. A host cell into which has been introduced a sequence in accordance with any one 
of claims 27 to 36. 

38. An effective portion of a class A SBE polypeptide obtainable from potato plants and 
encoded by a nucleotide sequence in accordance with any one of claims 27 to 36. 

39. A polypeptide according to claim 38, comprising substantially the sequence of amino 
acids 49 to 882 of the sequence shown in Figure 5, or a functional equivalent thereof. 

40. A polypeptide according to claim 38 or 39, comprising the sequence of amino acids 
1 to 48 of the sequence shown in Figure 5. 

41. A polypeptide in accordance with any one of claims 38, 39 or 40 in substantial 
isolation from other plant-derived constituents.

42. A method of altering the characteristics of a plant, comprising introducing into the 
plant a portion of a nucleotide sequence in accordance with any one of claims 27 to 36, 
operably linked to a suitable promoter active in the plant, so as to affect the expression 
of a gene present in the plant. 

43. A method according to claim 42, wherein the nucleotide sequence is operably linked 
in the anti-sense orientation to a suitable promoter active in the plant. 

44. A method according to claim 42, wherein the introduced sequence comprises one or 
more of the following operably linked in the sense orientation to a promoter active in the 
plant, so as to cause sense suppression of an enzyme naturally expressed in the plant: a 
5' untranslated region, a 3' untranslated region, or a coding region of the potato SBE class 
A starch branching enzyme. 



45. A method according to any one of claims 42, 43 or 44, further comprising 
introducing into the plant one or more further sequences. 

46. A method according to claim 45, wherein one or more of the further sequences are 
operably linked in the anti-sense orientation to a suitable promoter active in the plant. 

47. A method according to claim 45 or 46, wherein the further sequence comprises a 
portion of a class B SBE nucleotide sequence. 

48. A method according to any one of claims 42 to 47, effective in altering the starch 
composition of a plant. 

49. A plant or plant cell having characteristics altered by the method of any one of claims 
42 to 48, or the progeny of such a plant, or part of such a plant. 

50. A plant according to claim 49, selected from one of the following: potato, pea, 
tomato, maize, wheat, rice, barley, sweet potato, and cassava. 

51. A tuber or other storage organ from a plant according to claim 49 or 50. 

52. Use of a tuber or other storage organ according to claim 51, in the preparation and/or 
processing of a foodstuff. 

53. A plant according to claim 49 or 50, containing starch which, as extracted from the 
plant by wet milling at ambient temperature, has an elevated viscosity onset temperature 
as judged by viscoamylograph conducted according to the protocol defined in claim 7, 
compared to starch extracted from a similar, but unaltered, plant. 

54. A plant according to claim 53, wherein the viscosity onset temperature is elevated by 
an amount in the range of 10 to 25°C. 

55. A plant according to claim 49 or 50, containing starch which, as extracted from the 
plant by wet milling at ambient temperature, has a decreased peak viscosity as judged by 
viscoamylograph conducted according to the protocol defined in claim 7, compared to 
starch extracted from a similar, but unaltered, plant. 

56. A plant according to claim 55, wherein the peak viscosity is decreased by an amount 
in the range of 240 to 700 SNUs. 

57. A plant according to claim 49 or 50, containing starch which, as extracted from the 
plant by wet milling at ambient temperature, has an increased pasting viscosity as judged 
by viscoamylograph conducted according to the protocol defined in claim 7, compared to 
starch extracted from a similar, but unaltered, plant. 

58. A plant according to claim 57, wherein the pasting viscosity is increased by an 
amount in the range of 37 to 260 SNUs. 

59. A plant according to claim 49 or 50, containing starch which, as extracted from the 
plant by wet milling at ambient temperature, has an increased set-back viscosity as judged 
by viscoamylograph conducted according to the protocol defined in claim 7, compared to 
starch extracted from a similar, but unaltered, plant. 

60. A plant according to claim 59, wherein the set-back viscosity is increased by an 
amount in the range of 224 to 313 SNUs. 

61. A plant according to claim 49 or 50, containing starch which, as extracted from the 
plant by wet milling at ambient temperature, has a decreased set-back viscosity as judged 
by viscoamylograph conducted according to the protocol defined in claim 7, compared to 
starch extracted from a similar, but unaltered, plant. 

62. A plant according to claim 49 or 50, containing starch which, as extracted from the 
plant by wet milling at ambient temperature, has an elevated apparent amylose content as 
judged by iodometric assay according to the method of Morrison & Laignelet, compared 
to starch extracted from a similar, but unaltered, plant. 

63. A plant according to claim 49 or 50, containing starch which, as extracted from the 
plant, has a phosphorus content in excess of 200 mg/100 grams dry weight starch. 

64. Starch obtainable from a plant according to any one of claims 49, 50 or 53-63. 

65. Starch according to claim 64 and further in accordance with any one of claims 1-22. 

66. A method of modifying starch in vitro, comprising treating starch under suitable 
conditions with an effective amount of a polypeptide in accordance with any one of claims 
38 to 41. 

67. A potato plant or part thereof which, in its wild type possesses an effective SBE A 
gene, but which plant has been altered such that there is no effective expression of an SBE 
A polypeptide within the cells of at least part of the plant. 

68. A potato plant according to claim 67, wherein the alteration is effected by a method 
according to any one of claims 42-48. 
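
By way of illustration only, the nucleotide ranges recited in claims 31, 32 and 34, the in-frame ATG of claim 33 and the anti-sense orientation of claim 43 can be sketched in code. The following Python fragment is a minimal sketch, not part of the claimed subject matter. It assumes that nucleotide positions are numbered from 1 and are inclusive at both ends, and that the anti-sense orientation corresponds to the reverse complement of the sense strand. The function names and the short toy sequence are hypothetical and are not taken from the figures.

```python
# Illustrative helpers only. All names and the toy sequence are assumptions
# introduced for this sketch; none of them comes from the specification.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def subsequence(seq: str, start: int, end: int) -> str:
    """Return nucleotides start..end, numbered from 1, inclusive at both ends."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("range lies outside the sequence")
    return seq[start - 1:end]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement (assumed anti-sense reading) of seq."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_in_frame_atg(seq: str, frame_start: int = 1) -> bool:
    """True if an ATG codon lies in the frame that begins at frame_start (1-based)."""
    s = seq.upper()
    return any(s[i:i + 3] == "ATG" for i in range(frame_start - 1, len(s) - 2, 3))


toy = "GGATGGTGTATACACTC"                # hypothetical 17-nucleotide sequence
coding = subsequence(toy, 3, 17)         # "nucleotides 3 to 17" of the toy sequence
antisense = reverse_complement(coding)   # orientation assumed for anti-sense use
```

The same `subsequence` call would apply to a range such as nucleotides 228 to 2855 once the full sequence of the relevant figure is available as a string.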
TTGATGGGGCCTTGAACTCAGCAATTTGACACTCAGTTAGTTACA 

AACTACCCCGGAACTTGAGTCGTTAAACTGTGAGTCAATCAATGT 
AAGGAATGAATAAAAGGATAGATTTGTAAAAACCCTAAGGAGAGA 



TTCCTTACTTATTTTCCTATCTAAACATTTTTGGGATTCCTCTCT 
M N K R I D L 

GTTCCATCAGTGTACAAATCTAATGGATTCAGCAGTAATGGTGAT 
' : — H ' H 1 1 1 1 H 

CAAGGTAGTCACATGTTTAGATTACCTAAGTCGTCATTACCACTA 
VPSVYKSNGFSSNGD 

Bgl II EcoR I 

TCACGGAAGATCTTGGCTGAAAAGTCTTCTTACAATTCCGAATTC 
1 1 ' I ' ■ ■ I 1 1 1 h 

AGTGCCTTCTAGAACCGACTTTTCAGAAGAATGTTAAGGCTTAAG 

SRK I LAEK SSYNSEF 

ACCCAGAGTGATAGCTCCTCATCCTCAACAGACCAATTTGAGTTC 

^ H 1 H- 1 1 , — ^. 

TGGGTCTCACTATCGAGGAGTAGGAGTTGTCTGGTTAAACTCAAG 
TQSDSSSS ST DQ.FEF 

AGTTCAACAATGGAACACGCTAGCCAGATTAAAACTGAGAACGAT 
' 1— I 1 1-^ 1 , 1 ^ 

TCAAGTTGTTACCTTGTGCGATCGGTCTAATTTTGACTCTTGCTA 
SSTMEHASQIKTENO 

GATTTTGCTTCATCACTACAACTACAAGAAGGTGGTAAACTGGAG 
H- 1 H ^ 1 

ctaaaacgaagtagtgatgttgatgttcttccaccatttgacctc 
dfasslqlqeggkleJ 
Bgl II 

CTCCTATCACTTATCAGATCTCTATTTTTTCTCTTAATTCCAACC 
1 H ' 1 I ■ ■ I 90 

GAGGATAGTGAATAGTCTAGAGATAAAAAAGAGAATTAAGGTTGG 

AGAAGAAAGATGGTGTATACACTCTCTGGAGTTCGTTTTCCTACT 
' 1 1— i 1 ■ ' ■ ' ' I ' ' ■ ' I f- 180 

TCTTCTTTCTACCACATATGTGAGAGACCTCAAGCAAAAGGATGA 
MVYTLSGVRFPT 

CGGAGGAATGCTAATGTTTCTGTATTCTTGAAAAAGCACTCTCTT 
1 1 1 1 1 i i ^ [. 270 

GCCTCCTTACGATTACAAAGACATAAGAACTTTTTCGTGAGAGAA 
RRNANVSVFLK KHSL 



CGACCTTCTACAGTTGCAGCATCGGGGAAAGTCCTTGTGCCTGGA 

" H 1 — — ^ 1 H — ' ' I . I . . I 360 

GCTGGAAGATGTCAACGTCGTAGCCCCTTTCAGGAACACGGACCT 
RPSTV.A. ASG KVL V PG 

ACTGAGACATCTCCAGAAAATTCCCCAGCATCAACTGATGTAGAT 

1 ' I 1 > h 1 H- 450 

TGACTCTGTAGAGGTCTTTTAAGGGGTCGTAGTTGACTACATCTA 

TETSPENS PASTDVD 

GACGTTGAGCCGTCAAGTGATCTTACAGGAAGTGTTGAAGAGCTG 
' 1 ' I 1— HH H 

CTGCAACTCGGCAGTTCACTAGAATGTCCTTCACAACTTCTCGAC 
OVEPS SDLTGSV EEL 

GAGTCTAAAACATTAAATACTTCTGAAGAGACAATTATTGATGAA 

1 ' ' ' ' ' I ' ' I ' I .... I 630 

CTCAGATTTTGTAATTTATGAAGACTTCTCTGTTAATAACTACTT 

ES.KTLN. TSEET I I DE 
TCTGATAGGATCAGAGAGAGGGGCATCCCTCCACCTGGACTTGGT 

' i H , , I .... I H 

AGACTATCCTAGTCTCTCTCCCCGTAGGGAGGTGGACCTGAACCA 
SDRI RERGIPPPGLG 

CACCTTGATTACAGGTATTCACAGTACAAGAAACTGAGGGAGGCA 
' 1 ' ■ ■ ' 1 -I 1 I I , I 

GTGGAACTAATGTCCATAAGTGTCATGTTCTTTGACTCCCTCCGT 
"LDYRYSQYKKL .REA 



GAAAAAATGG GTTTCACTCGTAGTGCTACAGGTATCACTTACCGT 

""""^ ' ' ' I 1 1 1- 1 

CTTTTTTACCCAAAGTGAGCATCACGATGTCCATAGTGAATGGCA 
EKMG FT RSATGITYR 

AACAATTG6 GACGCAAATGCTGACATTATGACTCGGAATGAATTT 

' ' I I I I I I I ' . I I . I 

TTGTTAACCCTGCGTTTACGACTGTAATACTGAGCCTTACTTAAA 
NNWDANADIMT R NEF 

GCAATTCCTCA TGGGTCCAGAGTGAAGATACGTAT6GACACTCCA 

' ' ' • ' ' I I ' I . 1 I , I 

CGTTAAGGAGTACCCAGGTCTCACTTCTATGCATACCTGTGAGGT 
'^IPHGS RVKIRMDTpJ 
Hinc II 

CAGAAGATTTATGAAATAGACCCCCTTTTGACAAACTATCGTCAA 

GTCTTCTAAATACTTTATCTGGGGGAAAACTGTTTGATAGCAGTT 
QKIYEIDPLLTNYRQ 

ATTGACAAGTATGAGGGTGGTTTGGAAGCCTTTTCTCGTGGTTA T 

TAACTGTTCATACTCCCACCAAACCTTCGGAAAAGAGCACCAATA 
^DKYEGGLEAFSRGY 

.Pvu II 

GAGTGGGCTCTTGGTGCCCAGTCAGCTGCCCTCATTGGAGATTT C 

CTCACCCGAGAACCACGGGTCAGTCGACGGGAGTAACCTCTAAAG 
EWALG A QSAAL I GDF 

GGTGTCTGGGAGATTTTTCTGCCAAATAATGTGGATGGTTCTCC T 

CCACAGACCCTCTAAAAAGACGGTTTATTACACCTACCAAGAGGA 
GY WEIFLPNNVDGSP 
CAGCTTCCTGATGAAATTCCATATAATGGAATACATTATGATCCA 
' H H ' ' I ' I ' r I . I 

GTCGAAGGACTACTTTAAGGTATATTACCTTATGTAATACTAGGT 
QLPDEIPYNGIHYDP 



CCAAAGTCGCTGAGAATATATGAATCTCATATTGGAATGAGTAGT 

H — ' ■ I ' ' ' ■ I 1 .... I 

GGTTTCAGCGACTCTTATATACTTAGAGTATAACCTTACTCATCA 
PKSLRIYESHIGMSS 



.HinD III 

CTTCCTCGCATAAAAAAGCTT.GGGTACAATGCGCTGCAAATTATG 

' -f- I 1 ■ ' ' I h 

GAAGGAGCGTATTTTTTCGAACCCATGTTACGCGAGGTTTAATAC 
LPRIK KLGYNALQI M 

ACAAATTTTTTTGCACCAAGCAGCCGTTTTGGAACGCCCGAGGAC 

• ■■ — I ' ' I . ' ' I I I ... I n I ■ , I K 

TGTTTAAAAAAACGTGGTTCGTCGGCAAAACCTTGCGGGCTGCtG 
TNF FA PSSRF GTP DD 

CTCATGGACATTGTTCACAGCCATGCATCAAATAATACTTTAGAT 

-—^^ H I ' — I 1 — \^ H 

GAGTACCTGTAACAAGTGTCGGTACGTAGTTTATTATGAAATCTA 
LMDI VHS HASNNTLD^ 



^Sheet 
6 



Fig. 5 SHEET 5 



CCCGAAGAGGAGAGGTATATCTTCCAACACCCACGGCCAAAGAAA 
1 h — -—I 1 ' ' ■ ■ I 1— H 1- 1 170 

GGGCTTCTCCTCTCCATATAGAAGGTTGTGGGTGCCGGTTTCTTT 
PEEE.RYIFQ HPR PK K 



CCGGAGCCTAAAATTAACTCATACGTGAATTTTAGAGATGAAGTT 

— h— H 1 I . . I I , I , I , , I , — ^ — , }. 1260 

GGCCTCGGATTTTAATTGAGTATGCACTTAAAATCTCTACTTCAA 
PEPK I NSYVNFR DEV 



GCTATTCAAGAGCATTCTTATTACGCTAGTTTTGGTTATCATGTC 
1 \ 1 ' ' ' ■ ' h 1 — — 1 1- f- 1350 

CGATAAGTTCTCGTAAGAATAATGCGATCAAAACCAATAGTACAG 
AIG E HSYYAS FGYHV 



CTTAAGTCTTTGATTGATAAAGCTCATGAGCTAGGAATTGTTGTT 

' ' ' ' ' ' ' I I I I 1440 

GAATTCAGAAACTA.ACTATTTCGAGTACTCGATCCTTAACAACAA 

LKSLIDKAHELGIVV 

GGACTGAACATGTTTGAC.TGCACCGATAGTTGTTACTTTCACTCT 

H I I ' 1- 1530 

CCTGACTTGTACAAACTGACGTGGCTATCAACAATGAAAGTGAGA 

GLNMFDCTDSCY FHS 
TATGGAAACTGGGAGGTACTTAGGTATCTTCTCTCAAATGCGAGA 

ATACCTTTGACCCTCCATGAATCCATAGAAGAGAGTTTACGCTCT 
YGNWEVLRYLLSNAR 

GTGACATCAATGATGTATATTCACCACGGATTATCGGTGGGATTC 

CACTGTAGTTACTACATATAAGTGGTGCCTAATAGCCACCCTAAG ^'^^^ 
VTSMMYIHHGLSVGF 



GCTGTTGTGTATCTGATGCTGGTCAACGATCTTATTCATGGGCTT 



Hinc II 

TCTTATTCATGGGCTT 

1800 



CGACAACACATAGACTACGACCAGTTGCTAGAATAAGTACCCGAA 
AVVYLMLVNDLIHGL 

ACATTTTGTATT CCCGTCCAAGAGGGGGGTGTTGGCTTTGACTAT 

' ■ ' I — ^ — I ' ■ ■ ■ I —I ^ 1890 

TGTAAAACATAAGGGCAGGTTCTCCCCCCACAACCGAAACTGATA 
'I^CIPVQEGGVGFDY 

AAACGGGATGAGG ATTGGAGAGTGGGTGATATTGTTCATACACTG 

' ' I I ■ I I I I , I I I . .. I iggn 

TTTGCCCTACTCCTAACCTCTCACCCACTATAACAAGTATGTGAC 
KRDEDWRVGDIVHTL 

CATGATCAAGC TCTAGTCGGTGATAAAACTATAGCATTCTGGCTG 

' I I — H 1 1 J. 2070 

GTACTAGTTCGAGATCAGCCACTATTTTGATATCGTAAGACCGAC 
HDQALVGDKTIAFWL 
Hinc II 

ATGGACAAGGATATGTATGATTTTATGGCTCTGGATAGACCGTCA 
' 1 1 1 1 i 1 1 K 

TACCTGTTCCTATACATACTAAAATACCGAGACCTATCTGGCAGT 
"DKDMYDFMALDRPS 



Asp 718 
Kpn I 

CTTGTAACTATGGGATTAGGAGGAGAAGGGTACCTAAATTTCATG 
' I I I I 1 h 

GAACATTGATACCCTAATCCTCCTCTTCCCATGGATTTAAAGTAC 
!-VTM.GLGGEGYL .NF M 

GAACAACACCTCTCTGATGGCTCAGTAATCCCCGGAAACCAATTC 
' 1 f H- — 1 1 1 K 

CTTGTTGTGGAGAGACTACCGAGTCATTAGGGGCGTTTGGTTAAG 
EQHLSDG SVIPGNQF. 

^Sspl 

TATTTAAGATACCGTGG6TTGCAAGAATTTGACCGGCCTATGCAG 
— ' 1 ' I ' . . I I . . I H — 

ATAAATTCTATGGCACCCAAC6TTCTTAAACTGGCGGGATACGTC 

YLR.YRGLQEFDRPMQ 

ATATCACGAAAGGATGAAGGAGATAGGATGATTGTATTTGAAAAA 
— — ^ 1- ' I ' I ' 1 1 — I — M 1 -+ 

TATAGTGCTTTCCTACTTCCTCTATCCTACTAAGATAAACTTTTT 
ISRKD. EGDRMIV FEK 

TCAGACTATCGCATAGCCTGCCTGAAGCCTGGAAAATACAA6GTT 
1 1 1 1 1 1 1 H 

AGTCTGATAGCGTATCGGAC6GACTTCGGACCTTTTATGTTCCAA 
SDYRIA C. LKPGKYKV^' 
ACATCATTAATAGATCGTGGGATAGCATTGCACAAGATGATTAGG 

TGTAGTAATTATCTAGCACCCTATCGTAACGTGTTCTACTAATCC 
^SLI .DRGIALHK MI R 



.EcoR I 

GGAAATGAATTCGGCCA CCCTGAGTGGATTGATTTCCrTAnQCPT 



CCTTTACTTAAGCCGGTGGGACTCACCTAACTAAAGGGATCCCGA 
GNEFGHP E WIDFPRA 

AGTTATGATAAATGCAGACGGAGATTTGACCTGGGAGATGCAG AA 

TCAATACTATTTACGTCTGCCTCTAAACTGGACCCTCTACGTCTT ^^'^^ 
2YDKCRR RF DLGDAE 



TATCTTGAAGATAAATATGAGTTTAT6ACTTCAGAACACCA6T TC 

ATAGAACTTCTATTTATACTCAAATACTGAAGTCTTGTGGTCAAG 
YLE D K YEFMTSEHQF 



GGAAACCTAGTTTTTGTCTTTAATTTTCACTGGACAAAAAGCTAT 

CCTTTGGATCAAAAACAGAAATTAAAAGTGACCTGTTTTTCGATA ^^^^ 
G N L V F V F N F H W T K. S Y . 

GCCTTGGACTCAGATGATCCACTTTTTGGTGGCTTCGGGAGAAT T 

CGGAACCTGAGTCTACTAGGTGAAAAACCACCGAAGCCCTCTTAA ' 
ALOSDDPLFG G FGR I 
Ssp I 

GATCATAATGCCGAATATTTCACCTTTGAAGGATGGTATGATGAT 

— — < - I I I .. .I II . I .... I 

CTAGTATTACGGCTTATAAAGTGGAAACtTCCTACCATACTACTA 
DHNAEY FTFEGWYDD 

GTCTATGCA.CTAGTA6ACAAAGAAGAAGAAGAAGAAGAAGAAGAA 
' I ' ' ■ t H H H — I 1 . I . . I I I 

CAGATACGTGATCATCTGTTTCTTCTTCTTCTTCTTCTTCTTCTT 
VYALVD KEE E EEEEE 

TGAACGAACTTGTGATCGCGTTGAAAGATTTGAACGCTACATAGA 

' h- I 1 I ' ' I . . I . . — 

ACTTGCTTGAACACTAGCGCAACTTTCTAAACTTGCGATGTATCT 

TCATGTGACACAAGGTTTGCAATTCTTTCCACTATTAGTAGTGCA 
' 1 -I 1 ^ H 1 1 -4- 

AGTACACTGTGTTCCAAACGTTAAGAAAGGTGATAATCATCACGT 

EcoR I Pst I 

GATGAATTTATGTCGAATGCTGGGACGATCGAATTCCTGCAGGCC 

' ' I -I 1 — -H ' r ' I i . . I 

CTACTTAAATACAGCTTACGACCCTGCTAGCTTAAGGACGTCCGG, 

Fig. 5 SHEET 11 
CGTCCTCGTTCAATTATGGTGTATGCACCTTGTAAAACAGCAGTG 

1 1 I I I I I I I , , . I ■ ■ . , I 2700 

GCAGGAGCAAGTTAATACCACATACGTGGAACATTTTGTCGTCAC 
RPRSIMVYAP .CKTAV 

GAAGAAGAAGTAGCAGCAGTAGAAGAAGTAGTAGTAGAAGAAGAA 

1 ■ ' ' I ' ' ' I I . . i I I I , . , I , . ■ ■ I 2790 

CTTCTTCTTCATCGTCGTCATCTTCTTCATCATGATCTTCTTCTT 
EEEVA.AVEEVVV EEE 



,Sspl 

GCTTCTTGACGTATCTGGCAATATTGCATCAGTCTTGGCGGAATT 

' • ' ' I J I I ' I 2880 

CGAAGAACTGCATAGACCGTTATAACGTAGTCAGAACCGCGTTAA 



pla I 

ACGATATACGCAGAGATGAAGTGCTGAACAAACATATGTAAAATC . 
1— — M 1 I I ... I ■■ I I I I , , I 2970 

TGCTATATGCGTCTCTACTTCACGAeTTGTTTGTATACATTTTAG 



GGGGGACCCCTTAGTTCT 
\ • i — ^ 3033 



CCCCCTGGGGAATCAAGA 

Fig. 5 SHEET 12 
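
Each row of Figure 5 presents a stretch of the coding strand, the complementary strand beneath it, and the one-letter amino acid translation. The relationship between the three lines can be checked with a short script. The following Python fragment is a minimal sketch under stated assumptions: it uses the standard genetic code, the function names are hypothetical, and the row is the one containing the ATG start codon as transcribed from the figure, with the character "6" in the printed lower strand read as "G".

```python
# Illustrative check of one Figure 5 row. Function names are assumptions;
# the codon table is the standard genetic code, bases ordered T, C, A, G.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def complement(seq: str) -> str:
    """Return the base-by-base complement, as printed under the top strand."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))


def translate(seq: str, offset: int = 0) -> str:
    """Translate complete codons of seq, starting at a 0-based offset."""
    s = seq.upper()
    return "".join(CODON_TABLE[s[i:i + 3]] for i in range(offset, len(s) - 2, 3))


# Top strand of the row that contains the ATG start codon.
top = "AGAAGAAAGATGGTGTATACACTCTCTGGAGTTCGTTTTCCTACT"
lower = complement(top)                            # strand printed beneath the top strand
protein = translate(top, offset=top.find("ATG"))   # translation from the first ATG
```

Under these assumptions `protein` reproduces the translation line printed for that row, and `lower` reproduces the complementary strand once the misread character is corrected.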
180 190 200 210 220 

1 YE I DPLLTNYRQHLDYRYSQYKKLREA I DKYEGGLEAFSRGYEKMGFTR 

: : : DP L Y : H: .R.:Y. : I: KYEG LE. F: : GY K. GF. R 

LLNLDPTLEPYLDHFRHRMKRYVDQKMLIEKYE6PLEEFAQGYLKFGFNR 

^•100 ♦■110 *-120 *-130 *-H0 

f230 f240 ^250 f260 ^270 

SATG I TYREWALGAQSAAL I GDFNNWDANAD I MTRNEFGVWE 1 FLPNNVD 
. . . 1. YREWA : AQ. A. : IGDFN. W: : : . : : M. : : : FGVW. 1 : P: VD 
EDGC I VYREWAPAAQEAEV IGDFNGWNGSNHMMEKDQFGVWS IR IPD-VD 

^150 *-160 *-170 *-180 -^190 

f280 ^290 ^300 f310 ^320 

GSPA I PHGSRVK I RMDTPSGV-KDS I PAW I NYSLQLPDE I —PYNG I HYD 
: . P. IPH. SRVK: R. . : GV D. IPAWI: Y: . PY: G: . . D 

SKPVIPHNSRVKFRFKHGNGVWVDRIPAWIKYATADATKFAAPYDGVYWD 

^200 ^210 *-220 *-230 *-240 

f330 ^340 ^350 ^360 ^370 

PPEEERYIFQHPRPKKPKSLRIYESHIGMSSPEPKINSYVNFRDEVLPRl 
PP . ERY F: . PRP KP: : RIYE: H: GMSS: EP: : NSY : F D: VLPRI 
PPPSERYHFKYPRPPKPRAPRI YEAHVGMSSSEPRVNSYREFADDVLPRl 

'^250 ♦•260 ♦■270 ♦■280 ^^290 

^380 f390 f^400 ^410 fa20 

KKLGYNALQIMAIQEHSYYASFGYHVTNFFAPSSRFGTPDDLKSLIDKAH 
K .YN::Q:MAI EHSYY: SF6YHVTNFFA S: R: G. P: DLK LIDKAH 
KANNYNTVQLMAIMEHSYYGSFGYHVTNFFAVSNRYGNPEDLKYLIDKAH 

♦■300 ♦•SIO ♦•320 ♦•330 ♦■S^O 

^430 f440 f450 f460 ^470 

EL6 I VVLMD I VHSHASNNTLD6LNMFDC---TDSCYFHSGARGYHWMWDS 
. LG: VL: D: VHSHASNN. DGLN FD : : . . YFH: G. RGYH : WDS 
SLGLQVLVDVVHSHASNNVTDGLNGFDIGQGSQESYFHAGERGYHKLWDS 

♦•350 ♦•360 ♦•370 ♦■380 ♦■390 

^480 ^490 ,r500 ^510 ^520 

RLFNYGNWEVLRYLLSNARWWLDAFKFDGFRFDGVTSMMYIHHGLSVGFT 
RLFNY: NWEVLR: LLSN RWWL: . : : FDGFRFDG: TSM: Y: HHG: : : GFT 
RLFNYANWEVLRFLLSNLRWWLEEYNFDGFRFDGITSMLYVHHGINMGFT 

♦■400 ♦•^lO ♦•420 ♦•430 ^^440 

^530 f540 f550 ^560 f570 

GNYEEYFGLATDVDAVVYLMLVNDLIHGLFPDAITIGEDVSGMPTFCIPV 
GNY: EYF: ATDVDAVVYLML N: LIH : FPDA. . I: EDVSGMP. : . PV 
GNYNEYFSEATDVDAVVYLMLANNLIHKIFPDATVIAEDVSGMPGLSRPV 

♦•450 ^^460 ♦•470 ♦•480 ♦■490 

f580 ^590 f600 fSIO ^620 

QEGGVGFDYRLHMA I ADKR I ELLK -KRDEDWRVGD I VHTLTNRRWSEKCV 
EGG: GFDYRL MAI: DK: I: LK K. DEDW. : ::. : LTNRR. : EKC: 
SE GG I GF D Y RLA MA I PD KW I DY LK NKN DEDWS MK E VT SS LTN RR YTE KC I 

♦•500 ♦■510. ^620 ♦530 ♦■540 
630 640 650 660 670 

SYAESHDQAL VGDKT I AFWLMDKDMYDFMALDRPSTSL I DRG I ALHKM I R 
: YAESHDQ: : VGDKTIAF LMDK: MY. M: : : : : : DRGI ALHKMI- 
AYAESHDQS I VGDKTI AFLLMDKEMYSGMSCLTDASP VVDRG I ALHKMI H 
*-550 . -^560 *-570 *-580 ^590 

^680 f690 f700 f710 ^720 

LVTMGLGGEGYLNFMGNEFGHPEWIDFPRAEQHLSDGSVIPGNQFSYDKC 
: TMiLGGEGYLNFMGNEFGHPEWIDFPR GN: SYDKC 

FFTMALGGEGYLNFMGNEFGHPEWIDFPR EGNNWSYDKC 

^00 ^10 ^20 ^630 

^730 ,r740 ^750 ^760 ^^770 

RRRFDLGDAEYLRYRGLQEFDRPMQYLEDKYEFMTSEHQFISRKDEGDRM 
RR: . : L: D: E. LRY: : : . FDR: M: L: : K: . F: : S. . Q: : S. . D: : : : : 
RRQWNLADSEHLRYKFMNAFDRAMNSLDEKFSFLASGKQIVSSMDDDNKV 
*-640 ^50 ^60 ^70 ^80 

^780 f790 ^800 ^810 ^820 

IVFEKGNLVFVFNFHWTKSYSDYRIACLKPGKYKVALDSDDPLFGGFGRI 
: VFE: G: LVFVFNFH . : : Y. : Y: : : C PGKY: VAL: SD. FGG GR 

VVFERGDLVFVFNFHPNNTYEGYKVGCDLPGKYRVALGSDAWEFGGHGRA 
^690 *-700 *-710 *-720 *-730 

^830 ^840 f850 ^860 

DHNAEYFT —--FEGWYDDRPRS IMVYAPCKTAVVYALVDKEEEEE 

• H: . : . FT E. : : : RP. S: . V : P : T V. Y VD. E 

GHDVDHFTSPEGIPGVPETNFNGRPNSFKVLSPARTCVAYYRVDERMSET 
^740 *-750 *-760 ^770 ^780 

^870 

EEEEEEV 

E: : . : : 

EDYQTDI 
'^■790 
10 20 30 40 

MVYTLSGVRFPTVPSVYKSNGFSSNGDRRNANVSVFLKKH--SLSRKILA 
MVYT: SG: RFP. : PS: . KS : . DRR. : : S FLK: : S: SR L 

MVYTISGIRFPVLPSLHKS---TLRCDRRASSHSFFLKNNSSSFSRTSLY 

^50 ^60 ^70 f80 f90 

EKSSYNSEFRPSTVAASGKVLVPGTQSDSSSSSTDQFEFTETSPENSPAS 
. K S : SE : ; ST: A. S: KVL: P. . Q D: S S : DQ: E . : . : : E- • 
AKFSRDSETKSST I AESDK VL I PEDQ- DNSVSLADQLENPD I TSEDAQNL 
*-50 ^0 ^70 *-80 -^90 

*^100 fllO fl20 ^130 f140 

TDVDSSTMEHASQIKTENODVEPSSDLTGSVEELDFASSLQLQEGGKLEE 
. Di TM. ! I I I I.I I ... I II • • • s 

ED L - — T MK DGNK YN I D -E STSS YREV GD EKG SV TSS SL VDV NTDTQ - - A 
^100 ^110 ^120 ^130 ^1 

^150 ^160 ^170 ^180 ^190 

SKTLNTSEETIIDESDRIRERGIPPPGLGQKIYEIDPLLTNYRQHLDYRY 
• KT S:..: :. :I IPPPG GQKIYEIDPLL . . RQHLD: RY 

KKTSVHSDKKVKVDKPKI IPPPGSGQK I YE IDPLLQAHRQHLDFRY 

*^150 *-160 *-170 ^180 

^200 f210 ^220 f230 f240 

SQYKKLREAIDKYEGGLEAFSRGYEKMGFTRSATGITYREWALGAQSAAL 
: QYK: : RE. IDKYEGGL: AFSRGYEK. GFTRSATG ITYREW: GA: SAAL 

GQYKRIREEIDKYEGGLDAFSRGYEKFGFTRSATGITYREWGPGAKSAAL 
^190 *-200 ^^210 ^220 ^230 

f250 f260 ,r270 ^280 ^290 

I GDFNNWDANAD 1 MTRNEFGVWE I FLPNNVDGSPA IPHGSRVK I RMDTPS 
: GDFNNW: : NAD: MT: : . FGVWEIFLPNN. DGSP: IPHGSRVK I: MDTPS 

VGDFNNWNPNADVMTKDAF6VWEI FLPNNADGSPPIPHGSRVKI HMDTPS 
'^240 -^250 ^260 ^270 *-280 

f300 ^310 ^320 ^330 ^340 

GVKDSIPAWINYSLQLPDEIPYNGIHYDPPEEERYIFQHPRPKKPKSLRI 
G: KDSIPAWl: : S: Q P: EIPYNGI. YDPPEEE: Y: F: HP: PK: P: S: RI 

GIKDSIPAWIKFSVQAPGEIPYNGIYYDPPEEEKYVFKHPQPKRPQSIRI 
^•290 ^300 ^310 ^320 ^330 

^350 ,r360 ^370 ^380 ^390 

YESH I GMSSPEPK I NSYVNFRDEVLPR I KKLG YN ALQ I MA I QEHSYY ASF 
YESHIGMSSPEPKIN: Y. NFRD: VLPR I KKLGYNA: QIMAIQEHSYYASF 
YESH IGMSSPEPKINTYANFRDDVLPRIKKLGYNAVQIMAIQEHSYY ASF 
^340 ^350 ^360 *'370 *-380 

f400 <r410 f^420 f430. ,r440 

GYHVTNFFAPSSRFGTPDDLKSLIDKAHELGIVVLMDIVHSHASNNTLDG 
GYHVTNFFAPSSRFGTP: DLKSLID: AHELG: : VLMDIVHSH: SNNTLDG 
GYHVTNFFAPSSRFGTPEDLKSLIDRAHELGLLVLMDIVHSHSSNNTLDG 
^390 HOO H]0 H20 H30 
450 460 470 480 490 

LNMFDCTDSCYFHSGARGYHWMWDSRLFNYGNWEVLRYLLSNARWWLDAF 
LNMFD TD: YFH: G: RGYHWMWDSRLFNYG: WEVLRYLLSNARWWLD • 
LNMFDGTDGHYFHPGSRGYHWMWDSRLFNYGSWEVLRYLLSNARWWLOEY 
M40 M50 M60 -^470 ^480 

^500 ^510 . ^520 f530 ^540 

KFDGFRFDGVTSMMYIHH6LSVGFTGNYEEYFGLATDVDAVVYLMLVNDL 
KFDGFRFDGVTSMMY. HHGL V: FT6NY. EYFGLATDV: AVVY: MLVNDL 

KFDGFRFDGVTSMMYTHHGLQVSFTGNYSEYFGLATDVEAVVYMMLVNDL 
^90 ^500 ^510 ^520 S30 

^550 ^560 ^570 ^580 ^590 

IHGLFPDAITIGEDVSGMPTFCIPVQEGGVGFDYRLHMAIADKRIELLKK 
IHGLFP: A: : I GEDVSGMPTFC: P. Q: GG: GF: YRLHMA: ADK: lELLKK 

IHGLFPEAVSIGEDVSGMPTFCLPTQDGGIGFNYRLHMAVADKWIELLKK 
^540 ^550 ^560 ^570 ^580 

fSoo ^10 ^20 ^630 ^eno 

RDEDWRVGDI.VHTLTNRRWSEKCVSYAESHDQALVGDKTIAFWLMDKDMY 
: DEDWR: GDIVHTLTNRRW EKCV YAESHDQALVGDKT: AFWLMDKDMY 
QDEDWRMGDIVHTLTNRRWLEKCVVYAESHDQALVGDKTLAFWLMDKDMY 
S90 ^600 ^10 ^620 ^630 

^650 ^660 ^70 ^680 ^690 

DFMALDRPSTSLIDRGIALHKMIRLVTMGLGGEGYLNFMGNEFGHPEWID 
DFMALDRPST: LIDRGIALHKMIRL: TMGLGGEGYLNFMGNEFGHPEWI D 
DFMALDRPSTPLIDRGIALHKMIRLITMGLGGEGYLNFMGNEFGHPEWID 
^640 ^50 ^60 ^70 -^80 

f700 ^710 ^720 f730 ^7H0 

FPRAEQHLSDGSV I PGNQFSYDKCRRRFDLGDAEYLRYRGLQEFDRPMQY 
FPR: EQHL: : G. : : PGN: SYDKCRRRFDLGDA: YLRY: G: QEFDR: MO 
FPRGEQHLPNGK I VPGNNNSYDKCRRRFDLGDADYLRYHGMQEFDRAMQH 
^690 ^700 ^710 ^720 ^730 

f750 ^760 f770 ^^780 ^790 

LEDK YEFMTSEHQF I SRKDEGDRM I VFEKGNLVFVFNFHWTKSYSDYR 1 A 
LE: . Y. FMTSEHQ: ISRK: EGDR: I: FE: : NLVFVFNFHWT: SYSDY- ■ • 
LEETYGFMTSEHQY ISRKNEGDRV I IFERDNLVFVFNFHWTNSYSDYKVG 
^740 ^750 ^760 *-770 ^780 

f800 ^810 f820 ^830 ,r840 

CLKPGKYKVALDSDDPLFGGFGRIDHNAEYFTFEGWYDDRPRSIMVYAPC 
CLKPGKYK: . LDSDD. LFGGF. R: : H. AEYFT EGWYDDRPRS: : VYAP 

CLKPGKYKI VLDSDDTLFGGFNRLNHTAEYFTSEGWYDDRPRSFLVYAPS 
^790 ^800 ^810 ^820 ^830 

f850 ^860 f870 

KTAVVYALVDKEEEEEEEEEEEVAA 

: TAVVYAL. D E. E E . : . V. : 

RTAVVYALADGVESEPIELSDGVES 
*-840 *-85b ^860 
1 TT(i-AT- 

1 TTGA-r 

1 — — — B ga4 



72 TTTCTCnAATTCCAACCABG^ATGAATAAAAGGAT-A 

73 TTTCTCTTAATTCCAACCAAGG-AATGAATAAAAGGAT-A 
71 TTTCTCnAATTCCAACCAAGG-AATGAATAAAAgGAT-A 
165 TTTCTCTOATTCCAACCAAGG-AATGAATilAAABGAiiA 

191 TGTACAAATCTAATGGAnCAGCAGTAATGGTGATCGGAG 

191 tgtacaaatctaatggattcagcagtaatggtgatcggag 

189 tgtacaaatctaatggattcagcagtaatggtgatcggag 

274 tgtacaaatctaatggattcagcagtaatggtgatcggag 

311 aanccgaccttctacagttgcagcatcggggaaagtcct 

311 aattccgaccttctacagttgcagcatcggggaaagtcct 

309 aaJccgaccttctac/Ettgcagcatcggggaaagtcct 

394 aatSccgaccttctacagttgcagcatcggggaaagtcct 

431 cagcatcaactgatgtagatagttcaacaatggaacacgc 

431 cagcatcaactgatgtagatagttcaacaatggaacacgc 

429 cagcatcaactgatgtagatagttcaacaatggaacacgc 

514 cagcatcaactgatgtEgatagttcaacaatggaacacgc 

551 catcactacaactacaagaaggtggtaaactggaggagtc 

551 catcactacaactacaagaaggtggtaaactggaggagtc 

549 catcactacaactacaagaaggtggtaaactggaggagtc 

634 CATCAaACAACTACAAGAAGGTGGTAAACTGGAGGAGTC 

671 TTGGTCAGAAGATTTATGAAATAGACCCCCTTTTGACAAA 

671 TTGGTCAGAAGATTTATGAAATAGACCCCCTTTTGACAAA 

669 TTGGTCAGAAGATTTATGAAATAGACCCCCTTTTGACAAA 

754 ttggtcagaagatttatgaaatagacccccttttgacaaa 

791 aagdjttttctcgtggttatgaaaaaatgggtttcactcg 

791 aag( ^tttt ctcgtggttatgaaaaaatgggtttcactcg 

789 aagctttttctcgtggttatgaaaBaatgggtttcactcg 

874 AAGli 1 1 1 iCTCGTGGTTATGAAAAAATGGGTTTCACTCG J 
HBgggccttgaactcagcaatttgacactcagttagttac 

tggggccttgaactcagcaatttgacactcagttagttac 

tggggccttgaactcagcaatttgacactcagttagttac 

ibeehhbutggggccttgaactcagcaatttgacagtcagttagttac 

gatttgtaaaaaccctaaggagagaagaagaaagatggtgtataBactctct 
gatttgtaaaaaccctaaggagagaagaagaaagatggtgtatacactctct 
gattt gtaaaaaccct aaggagagaagaagaaagatggtgtatacactctct 
gatttgHHHIaaggagagaagaagaaagatggtgtatacactctct 

gaatgctaatgtttctgtattcttgaaaaagcactctctttcacggaagatc 
gaatgctaatgmctgtattcttgaaaaagcacrctctttcacggaagatc 
gaatgctaalbmctgtattcttgaaaaabcacrctctttcacggaagatc 
gaatgctaatgtttcrgtattcrtgaaaaagcacractttcacggaagatc 

tgtgccrggaarccagagtgatagctcctgatcctcaacagaccaatttgag 
tgtgcctggaaeccagagtgatagctcctcatcctcaacagaccaatttgag 
tgtjcctggaatccagagtgatagctcctcatcctcaacag/ilcaatttgag 
tgtecctggaatccagagtgatagctcacatcacaacagaccaatttgag 

tagccagattaaaaagagaacgatgacgttgagccgtcaagtgatcttaca 
tagccagattaaaactgagaacgatgacgttgagccgtcaagtgatcttaca 
tagccagattaaaactgagaacgatgacgttgagccgtcaagtgatgttaca 
tagccagattaaaactgagaacgatgacgttgagccgtcaagtgatcttaca 

taaaacattaaatacttctgaagagacaattattgatgaatagataggatc 
taaaacattaaatacttctgaagagacaattattgatgaatctgataggatc 
taaaacattaaatacttctgaagagacaattattgatgaatctgataggatc 
taaaacattaaatacttctgaagagacaattattgatgaatctgataggatc 

ctatcgtcaacaccttgattacaggtattcacagtacaagaaactgagggag 
ctatcgtcaacaccttgattacaggtattcacagtacaagaaactgagggag 
ctatcgtcaacaccttgattacaggtattcacagtacaagaaactgagggag 
ctatcgtcaacaccttgattacaggtattcacagtacaagaaaBtgagggag 

tagtgctacaggtatcacttaccgtgagtgggctcctggtgcccagtcagct 
tagtgctacaggtatcacttaccgtgagtgggaciitggtgcccagtcaga 
tagtgctacaggtatcacttaccgtgagtgggacctggtgcccagtcaga 
tagtgaacaggtatcacttaccgtgagtgggctcctggtgcccagtcagct/ 
ACTCCTATCACTTATCAGATCTCTATTT 11con.seq 
ACTCaATCACTTATCAGATCTCTATTT 19con.seq 
ACTBcBATCACTTATCAGATaCTATTT 10con.seq 
ACTCCTATCAClflATCAGATCTCTATTT psbe2con.seq 

GGAGTTCGTTTTCCTACTGTTCCATCAG 11con.seq 
GGAGnCGTTTTCCTACTGTTCCATCAG 19con.seq 
GGAGTTCGTTTTCCTACTGTTCCATCAG 10con.seq 
GGAGTTCGTTTTCaACTGTTCCATCAG psbe2con.seq 

TTGGCTGAAAAGTCTTCTTACAAnCCG 11con.seq 
TTGGCTGAAAAGtCTTCTTACAATTCCG 19con.seq 
TTGGCTGAAAAGTCTTCTTACAATTCCG 10con.seq 
TTGGCTGAAAAGTCTTCTTACBaTTCCG psbe2con.seq 

TTCACTGAGACATCTCCAGAAAATTCCC 11con.seq 
TTCACTGAGACATCTCCAGAAAATTCCC 19con.seq 
TTcBaGAGACATCTCCAGAAAAnCCC 10con.seq 
TTCACTGAGACaBcTCCAGAAAATTCCC psbe2con.seq 

GGAAGTGTTGAAGAGCTGGATTTTGCTT 11con.seq 
GGAAGTGTTGAAGAGCTGGATTTTGCTT 19con.seq 
GGAAGTGTTGAAGAGCTGGATTTTGCTT 10con.seq 
GGAAGTGTTGAAGACfiTGGATTTTGCTT psbe2con.seq 

AGAGAGAGGGGCATCCCTCCACCTGGAC 11con.seq 
AGAGAGAGGGGCATCCCTCCACCTGGAC 19con.seq 
AGAGAGAGGGGCATCCCTCCACaGGAC 10con.seq 
AGAGAGAGGGGCATCCCTCCACCTGGAC psbe2con.seq 

GCAATTGACAAGTATGAGGGTGGTTTGG 11con.seq 
GCAATTGACAAGTATGAGGGTGGTTTGG 19con.seq 
GCAATTGACAAGTATGAGGGTGGTTTGG 10con.seq 
GCAATTGACAAGTATGAGGGTGGTTTGG psbe2con.seq 

GCCCTCATTGGAGATTTCAACAATTGGG 11con.seq 
GCCCTCATTGGAGATTTCAACAATTGGG 19con.seq 
GCCCTCATTGcBGATTTCAACAAnGGG 10con.seq 
GCfiCTCATTGGAGATTTCAACAATTGGG psbe2con.seq 

Fig. 8 
910 acgcaaatgagacattatgactcggaatgaatttggtgtc 

911 acgcaaatgagacattatgactcggaatgaatttggtgtc 
909 acgcaaatgctgacBttatgactcggaatgaatttggtgtc 
994 acgcaaatgagacattatgactcggaatgaatttggtgtc 

1030 ctccatcaggtgttaaggattccattcctgcttggatcaac 

1031 accatcaggtgnaagganccattcctgcttggatcaac 
1029 ctccatcaggtgttaaggattccattcctgcttggatcaac 
1114 ctBcatcaggtgttaaggattccattcctgcttggatcaac 

1150 aacacccacggccaaagaaaccaaagtcgctgagaatatat 

1151 aacacccacggccaaagaaaccaaagtcgctgagaatatat 
1149 aacacccacggccaaagaaaccaaagtccBtgagaatatat 
1234 aacacccacggccaaagaaaccaaagtccagagaatatat 

1270 taaaaaa-gcttgggtacaatgcgBtgcSaattatggctat 

1271 taaaaaa-gcttgggtacaatgccBtgcaaattatggctat 
1269 taaaaaabgcttgggtacaatgcggtgcaaattatggaat 
1354 taaaaaaBcttgggtacaatgcggtgcaaattatggctat 



1389 gacgaccttaagtcnbgattgataaagctcatgagaagg 

1390 gacgaccttaagtctttgattgataaagacatgagctagg 
1389 gacgaccttaagtctttgattgataaagctcatgagctagg 
1473 gacgaccttaagtctttgattgataaagctcatgagaagg 

1509 gatagttgttactttcactctggagctcgtggttatcattg 

1510 gatagttgttactttcacraggagctcgtggttatcattg 
1509 gatagttgttactttcactctggagctcgtggttatcattg 
1593 gatagttgttactttcacraggagctcgtggttatcattg 

1628 gatgagttcaaatttgatggatttagattflgatggtgtgac 
1630 gatgBgttcaaatttgatggatttagatttgatggtgtgac 

1629 gatgagttcaaatttgatggatttagatttgatggtgtgac 
1713 gatgagtBcaaatttgEtggatttagatttgatggtgtgac 

1748 gtggatgctgttgtgtatagatgaggtcaacgatcttat 
1750 gtggatgctgttgtgtatctgatgctggtcaacgatcttat 

1749 gtggatgctgttgtgtatctgatgctggtcaacgatcttat 

1833 GllGATGCTGffiGTGTATCTGATGCTGGECAACGATCTTAT J 
TGGGAGA 
TGGGAGA 

tcEgaga 

TGGGAGA 



CTGCCAAATAATGTGGATGGTTCTCCTGCAATTC 
CTGCCAAATAATGTGGATGGTTCTCCTGCAATTC 
CTGCCAAATAATGTGGATGGTTCTCCTGCAATTC 
CTGCCAAATAATGTGGATGGTTaCCTGCAATTC 



TACTCTTTACAGCTTCCTGATGAAATTCCATATAATGGAATATATT 
TACTCTTTACAGCTTCCTGATGAAATTCGATATAATGGAATAgATT 
TACTCTTTACAGCTTCCTGATGAAATTCCATATAATGGAATATATT 
TACTCTTTACAGCTTCCTGATGAAATTCCATATAATGGAATATATT 

GAATCTCATATTGGAATGAGTAGTCCGGAGGCTAAAATTAACTCAT 
GAATCTCATATTGGAATGAGTAGTCCGGAGCCTAAAAnAACTCAT 
GAATCTCATATTGGAATGAGTAGTCCGGAGCaAAAATTAAaCAT 
GAATCTCATATTGGAATGAGTAGTCCGGAGCaAAAATTAACTCAT 

TCAAGAGCATTCTTATTATGCTAGTTTTGGTTATCATGTCACAAAT 
TCAAGAGCATTCTTATT/flGaAGTTTTGGTTATCATGTCACAAAT 
TCAAGAGCATTCTTATTATGCTAGTTTTGGTTATCATGTCACAAAT 
TCAAGAGCATTCTTATTATGaAGTTTTGGTTATCATGTCACAAAT 

AATTGTTGTTaCATGGACAlfiGTTCACAGCCATGCATCAAATAAT 
AATTGTTGTTCTCATGGACATTGTTCACAGCCATGCATCAAATAAT 
AATTGTTGTTCTCATGGACATTGTTCACAGCCATGCATCAAATAAT 
AATTGTTGTTCTCATGGACATTGTTCACAGCCATGCATCAAATAAT 

GATGTGGGATTlcCGCCTCTTTAAaATGGAAACTGGGAGGTACTT 
GATGTGGGATTCCCGCCTCTTTAAaATGGAAAaGGGAGGTACTT 
GATGTGGGATijCCGCCTCTTTAACTATGGAAAaGGGAGGTACTT 

gatgtgggattcccgcctctttaactatggaaactgggaggtactt 

atcaatgatgtatactcaccacggattatcggtgggattcactggg 
atcaatgatgtataBtcaccacggattatcggtgggattcactggg 
atcaatgatgtEtactcaccacggattatcggtgggattcactggg 
atcaatgatgtatactcaccacggattatcggtgggattcactggg 

tcatBggcttttcccagatgcaattaccattggtgaagatgttagc 
tcatgggcttttcccagatgcaattaccattggtgaagatgttagc 
tcatgggcttttcccagatgcaattaccattggtgaagatgttagc 
tcatgggcttttcccagatgcaattaccattggtgaagatgttagc-^ 



^Sheets 



Fig. 8 

SHEET 5 




CTCATGGGTCCAGAGTGAAGATACGTATGGACA 11con.seq 
CTCATGGGTCCAGAGTGAAGATACGTATGGACA 19con.seq 
CTCATGGGTCCAGAGTGAAGATACGTATGGACA 10con.seq 
CTCATGGGTCCAGAGTGAAGATAC(flATGGACA psbe2con.seq 

ATGATCCACCCGAAGAGGAGAGGTATATCTTCC 11con.seq 
ATGATCCACCCGAAGAGGAGAGGTATATCTTCC 19con.seq 
ATGATCCACCCGAAGAGGAGAGGTATATCTTCC 10con.seq 
ATGATCCACCCGAAGAGGAGAGGTAtEtCTTCC psbe2con.seq 

ACGTGAATTTTAGAGATGAAGTTCTTCCTCGCA 11con.seq 
ACGTGAATTTTAGAGATGAAGTTCTTCCTCGCA 19con.seq 
ACGTGAATTTTAGAGATGAAGTTCTTCCTCGCA 10con.seq 
ACGTGAATTTTAGAGATGAAGTTCTTCCTCGCA psbe2con.seq 

TTTTTTGCACCAAGCAGCCGTTTTGGAACGCCC 11con.seq 
TTTTTTGCACCAAGCAGCCGTTTTGGAACGCCC 19con.seq 
TTTTTTGCACCAAGCAGCCGTTTTGGAACGCCC 10con.seq 
I M 1 1 1 GCACCAAGCAGCCGTTTTGGAACGCCC psbe2con.seq 

ACTTTAGATGGACTGAACATGTTTGACGGCACC 11con.seq 
actttagatggactgaacatgtttgacBgcacc 19con.seq 
ACTTTAGATGGACTGAACATGTTTGACGGCACE 10con.seq 
ACTTTAGATGGACTGAACATGTTTGACGGCAqI psbe2con.seq 

AGGTATCTTCTCTCAAATGCGAGATGGTGGTTG 11con.seq 
AGGTATCTTCTCTCAAATGCGAGATGGTGGTTG 19con.seq 
AGGTATCTTCTCTCAAATGCGAGATGGTGGTTG 10con.seq 
AGGTATCTTCTCTCAAATGCGAGATGGTGGTTG psbe2con.seq 

AACTACGAGGAATACTTTGGACTCGCAACTGAT 11con.seq 
AACTACGAGGAATACTTTGGACTCGCAACTGAT 19con.seq 
AACTACGAGGAATACTTTGGACTCGCAACTGAT 10con.seq 
AACTACGAGGAATACTTTGGACTCGCAACTGAT psbe2con.seq 

GGAATGCCGACATTTTGTATTCCCGnCAAGAT 11con.seq 
ggaatgccgacattttgtattcccgtGcaagaE 19con.seq 
GGAATGCCGACATTTTGtStTCCCGTTCAAGAT 10con.seq 
GGAATGCCGACATTTTGTATTCCCGTTCAAGAT psbe2con.seq 

Fig. 8 SHEET 6 
1868 GGGGGTGTTGGCTTTGACTATCGGCTGCATATGGCAATTGC 
1870 GGGGGTGTTGGCTTTGACTATCGGCTGCATATGGCAAnGC 

1869 GGGGGTGTTGGCTTT6ACTATCGGCTGCATATGGCAATTGC 
1953 GGGGGTGTTGGCTTTGACTATCGGCTGCATATGGCAAnCC 

1988 AGATGGTCGGAAAAGTGTGTTTCATAGGCTGAAAGTCATGA 

1990 agatggtcggaaaagtgtgtttcatacgctgaaagtcatga 

1989 agatggtcggaaaagtgtgtttcatacgctgaaagtcatga 
2073 agatggtcggaaaagtgtgtttcatacgctgaaagtcatga 

2108 ccgEcaacatcattaatagatcgtgggatagcattgcacaa 
2110 ccgtcaacatcattaatagatcgtgggatagcattgcacaa 

2109 ccgtcaacatcattaatagatcgtgggatagcattScacaa 
2193 ccgtcaacatcattaatagatcgtgggatagcattgcacaa 

2228 tggattgatttccctagggctgaeceacacciitctgatgg 
2230 tggattgatttccctagggctgaacaacacctctctgatgg 

2229 tggatt6atttccctagggctgaagaacacctctctgatgg 
2313 tggattgatttccctagggctgaacaacacctctctgatgg 



2348 TACcBTGGGnECAAGAATTTGACBGGGCTATGCAGTATCT 
2350 TACCGTGGGTTGCAAGAATTTGACCG^CTATGCAGTATCT 

2349 TACCGTGGGTTGCAAGAATTTGACCGGGCTATGCAGTATa 
2433 TACCGTGGGTTGCAAGAATTTGACCGGGCTATGCAGTATCT 

2468 GAAaBaGGAAACCTAGTTTiBgTCTTTAATTTTCACTGGAC 
2470 GAAAAAGGAAACCTAGTTTTTGTCTTTAATTTTCACTGGAC 

2469 GAAAAAGGAAACCTAGI 1 1 1 1 GTCTTTAATTTTCACTGGAC 
2553 GAAAAAGGAAACCTAGTTTTTGTCTTTAATTTTCACTGGAC 



GGTGGCTTCGGGAGAATTGATCATAATGCCGAATATTT 
GGTGGCTTCGGGAGAATTGATCATAATGCCGAATATTT 
GGTGGCTTCGGGAGAATTGATCATAATGCCGAATATTT 

ggtggcttcgggagaattgatcataatgccgaatBttt 



Fig. 8
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2590 
2589 
2673 



2708 CTAGTAGACAAA|iiAGAA( 

2710 CTAGTAGACAAAGAAGAAGAAGAAGAAGAAGAA(<?»™*i 

2709 CTAGTAGACAAAGAAGAAGAAGAAGAAGAAGAAG- 
2793 CTAGTAGACAAAGAAGAAGAAGAAGAAGAA( 
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TGATAAATGGATTGAGTTGCTCAAGAAACGGGATGAGGATTGGAGA^ 

tgataaaBggattgagttgctcaagaaacgggatgaggattggaga 
tgataaatggattgagttgctcaagaaacgggatgaggattggaga 

TGATAAATGGATTGAGTTGCTCAAGAAACGGGATGAGGATTGGAGA 

TCAAGCTCTAGTCGGTGATAAAACTATAGCATTCTGGCTGATGGAC 
TCAAGCTCTAGTCGGTGATAAAACTATAGCATTCTGGCTGATGGAC 
TCAAGCTCTAGTCGGTGATAAAACTATAGCATTCTGGCTGATGGAC 
TCAAGCTCTAGTCGGTGATAAAACTATAGCATTCTGGCTGATGGAC 

GATGATTAGGCTTGTAACTATGGGATTAGGAGGA6AAGGGTACCTA 
GATGATTAGGGTGTAACTATGGGATTAGGAGGAGAAGGGTACCTA 
GATGATTAGGCTTGTAACTATGGGAnAGGAGGAGAAGGGTACCTA 
GATGATTAGGCTTGTAACTATGGGATTAGGAGGAGAAGGGTACCTA 

ctcagtaattcccggaaaccaattcagttatgataaatgcagacgg 
ctcagtaatScccggaaaccaattcagttatgataaatgcagacgg 
ctcagtaattcccggaaaccaattcagttatgataaatgcagacgg 

CTCAGTAATTCCCGGAAACCAATTCAGTTATGATAAATGGAGACGG 

tgaagataaatatgagtttatgacttcagaacaccagttcatatca 
tgaagataaatatgagtttatgacttcagaacaccagttcatatca 
tgaagataaatatgagtttatgacttcagaacaccagttcatatca 

TGAAGATAAATATGAGTTTATGACTTCAGAACACCAGTTCATATCA 

aaaGagctattcagactatcgcataggctgcctgaagcctggaaaa 
aaaaagctattcagactatcgcata(flctgcctgaagcaggaaaa 
aaaaBgctattcagactatcgcataggctgcctgaagcctggaaaa 
aaaaagctattcagactatcgcataggctgEctgaagcctggaaaa 

CACClSTGAAGGATfiGTATGATGATCGTCCiiGTTCAATTATGGTG 
CACCTTTGAAGGATGGTATGATGATCGTGaCGTTCAATTATGGTG 
CACCTTTGAA6GATGGTATGATGATCGTCCTCGTTCAATTATGGTG 
CACCTTTGAAGGATGGTATGATGATCGTCCTCGTTCAATTATGGTG 



— — TAGCAGTAGTAGAAGAAj^^TiiGHHAAGAATGAACG 

SSEEtagca^agtagaagaagtagtagtagaagaagaatgaacg 

tagcagtagtagaagaagtagtagtagaagaagaatgaacg 

tagcagtagtagaagaagtagtagtagaagaagaatgaacg 
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GTGGGTGATATTGTTCATACACTGACAAATAGA llcon . seq 
GTGGGTGATATTGTTCATACACTGACAAATAGA 19con.seq
GTGGGTGATATTGTTCATACACTGACAAATAGA 10con.seq
GTGGGTGATATTGTTCATACACTGACAAATAGA psbe2con.seq

AAGGATATGTATGATTTTATGGCTCTGGATAGA 11con.seq
AAGGATATGTATGATTTTATGGCTCTGGATAGA 19con.seq
AAGGATATGTATGATTTTATGGCTCTGGATAGA 10con.seq
AAGGATATGTATGATTTTATGGCTflTGGATAGA psbe2con.seq

AATTTCATGGGAAATGAATTCGGCCACCCTGAG 11con.seq
AATTTCATGGGAAATGAATTCGGCCACCCTGAG 19con.seq
AATTTCATGGGAAATGAATTCGGCCACCCTGAG 10con.seq
AATTTCATGGGAAATGAATTCGGCCACCCTGAG psbe2con.seq

AGATTTGACCTGGGAGATGCAGAATATTTAAGA 11con.seq
AGATTTGACCTGGGAGATGCAGAATATTTAAGA 19con.seq
AGATTTGACCTGGGAGATGCAGAATATTTAAGA 10con.seq
AGATTTGACCTGGGAGATGCAGAATATTTAAGA psbe2con.seq

CGAAAGGATGAAGGAGATAGGATGATTGTATTT 11con.seq
CGAAAGGATGAAGGAGATAGGATGATTGTATTT 19con.seq
CGAAAGGATGAAGGAGATAGGATGATTGTATTT 10con.seq
CGAAAGGATGAAGGAGATAGGATGATTGTATTT psbe2con.seq

TACAAGGTT^JCTTGGACTCAGATGATCCACTT 11con.seq
TACAAGGTTGCCTTGGACTCAGATGATCCACTT 19con.seq
TACAAGGTTGCCTTGGACtCAGATGATCCACTT 10con.seq
TACAAGGTTGCCTTGGACTCAGATGATCCACTT psbe2con.seq

TATGCACCTAGTAGAACAGCAGTGGTCTATGCA 11con.seq
TATGCACCTBgTaDaACAGCAGTGGTCTATGCA 19con.seq
TATGCACCTAGTAGAACAGCAGTGGTCTATGCA 10con.seq
TATGCACCTAGTAGAACAGCAGTGGTCTATGCA psbe2con.seq

AACTTGTGATCGCGTTGAAAGATTTGAACJra! 11con.seq
AACTTGTGATCGCGTTGAAAGATTTGAACG--- 19con.seq
AACTTGTGATCGCGTTGAAAGATTTGAACG--- 10con.seq
AACTTGTGATCGCGTTGAAAGATTTGAACG--- psbe2con.seq
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2795 li|j|slsii|iiicfiACATAGAGCTTCTTGA( 

2827 ctacatagagcttcttgacgtatctggcaatat 

2814 - cBacatagagcttcttgacgtatctggcaatat 

2895 CTACATAGAGCTTCTTGACGTATCTGGCAATAT 

2898 AGAGATGAAGTGCTGAACAAA-CATATGTAAAATCGATGAA 

2937 agagatgaagtgctgaacaaa--catatgtaaaatcgatgaa 
2924 agagatgaagtgctgaacaaaEBcatatgtaaaatcgatgaa 
3005 agagatgaagtgctgaacaaa-catatgtaaaatcgatgaa 



'Sheet n 
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3012 
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3123 



GCCCAGTAGAAATCAATTATGTGAGACCTAAAAAArAATAAr 
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B^tcagtcttggcggaattBcatgtgaca^aaggtttgcaEtt 

TGCAKAGTCTTGGCGGAATTTCATGTGACAg-AAGGTTTGCAATT 
TGCATllAGTCTTGGCGGAArrTCATGTGACAA-BAGGTTTGCAATT 
TGCATCAGTCTTGGCGGAATTTCATGTGACAA-AAGGTTTGCAATT 

TTTATGTCGAATGCTGGGACGATCGAATTCCTGCAGCC 
TTTATGTCGAATGCTGGGACGATCGAATTCCTGCAG 
TTTATGTCGAATGCTGGGACGATC GAATT CCTG CAGCC 
TmTGTCGAATGCTGGGACGEiflEBcffiGHiBEiEHE 



Fig. 8
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CATAAAATGGAAATAGTGCTGATCTAATGATGTTTTAAWrrMMMMA 
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CTTTCCACTATOGTAGTEcaBcGATATACGC llcon . seq 
CTTTCCACTATTAGTAGTGCAACGATATACGC 19con.seq
CTTTCCACTATTAGTAGTGCAACGATATACGC 10con.seq
CTTTCCACTATTAGTAGTGCAACGATATACGC psbe2con.seq

GTTCTGTAAATTGTCATCTGTTANATGTACA

11con.seq
19con.seq
10con.seq
psbe2con.seq

AAAAAAAAAAAAAACTCGA

11con.seq
19con.seq
10con.seq
psbe2con.seq
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GGATGCTAATGTTTCTGTATTCTTGAAAAAGCACTCTCTTTCACGG' 

I 1 1 1 1 . I I I ... I I 

CCTACGATTACAAAGACATAAGAACTTTTTCGTGAGAGAAAGTGCC 
ANVSVF LK KHSLSR 

TTCTACAGTTGCAGCATCGGGGAAAGTCCTTGTGCCTGGAAYCCAG 
— I ■ ' — ' I ' ' I I — 

AAGATGTCAACGTCGTAGCCCCTTTCAGGAACACGGACCTTRGGTC 
STVAASGKVLVPG7Q 

GACATCTCCAGAAAATTCCCCAGCATCAACTGATGTAGATAGTTCA 

> ' I I ' — ' I I — — I — . . . I . 

CTGTAGAGGTCTTTTAAGGGGTCGTAGTTGACTACATCTATCAAGT 
TSPENSPASTDVDSS 

TGAGCCGTCAAGTGATCTTACAGGAAGTGTTGAAGAGCTGGATTTT /cK^ft 
1 1 1 1 h 1 .... I 1 ^"^^ 

ACTCGGCAGTT.CACTAGAATGTCCTTCACAACTTCTCGACCTAAAA 
EPSSDLTGSVEELDF 

TAAAACATTAAATACTTCTGAAGAGACAATTATTGATGAATCTGAT 

' — I \ 1 1 1 ' I .... I — 

ATTTTGTAATTTATGAAGACTTCTCTGTTAATAACTACTTAGACTA 
KTLN TSEETII DESD 



Hinc II 

GATTTATGAAATAGACCCCCTTTTGACAAACTATCGTCAACACCTT 
1 i — 1 H — 1 ' ' ' ' I ■ ■ 1 — I— 



TAAATACTTTATCTGGGGGAAAACTGTTTGATAGGAGTTGTGGAA 
lYEIDPLLTNYRQHL^' 



Fig. 9



SHEET 1 
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Bglll 



AAGATCTTGGCTGAAAAGTCTTCTTACAATTCCGAATCCCGACC



TTCTAGAACCGACTTTTCAGAAGAATGTTAAGGCTTAGGGCTGG 
K I L AEKS SYNSESRP 

AGTGATAGCTCCTCATCCTCAACAGACCAATTTGAGTTCACTGA



TCACTATCGAGGAGTAGGAGTTGTCTGGTTAAACTCAAGTGACT 
SDSSSSSTDQFEFTE 



ACAATGGAACACGCTAGCCAGATTAAAACTGAGAACGATGACGT

I ' ' I I 1 1 ' ' ' ' I 270

TGTTACCTTGTGCGATCGGTCTAATTTTGACTCTTGCTACTGCA

TMEHASQIKTENDDV 



GCTTCATCACTACAACTACAAGAAGGTGGTAAACTGGAGGAGTC 

I H— M 1 ■ ■ • ' I ■ ■ ' ■ I ' 1 ■ ' ■ ' I 360 

CGAAGTAGTGATGTTGATGTTCTTCCACCATTTGACCTCCTCAG 

ASSLQLQEGG KLEES 



AGGATCAGAGAGAGGGGCATCCCTCCACCTGGACTTGGTCAGAA 

H -4-- — H I 1 ' — I- 450 

TCCTAGTCTCTCTCCCCGTAGGGAGGTGGACCTGAACCAGTCTT 

RIRERGIPPPGLGQK 



GATTACAGGTATTCACAGTACAAGAAACTGAGGGAGGCAATTGA 

1 I . . I .... I 1 I I h 540 

CTAATGTCCATAAGTGTCATGTTCTTTGACTCCCTCCGTTAACT 
DYRYSQYKKLREAID 



f 90 



f 180 
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CAAGTATGAGGGTGGTTTGGAAGCTTTTTCTCGTGGTTATGAAAAA^ 
' 1 I I -) 1 . I ■ 

GTTCATACTCCCACCAAACCTTCGAAAAAGAGCACCAATACTTTTT 
KYE.G .GLEAFSRGYEK 

Pvu II

GGCTCCTGGTGCCCAGTCAGCTGCCCTCATTGGAGATTTCAACAAT 
' 1 ^ f— i '-h^ 1 1 , , , , ■ 

CCGAGGACCACGGGTCAGTCGACGGGAGTAACCTCTAAAGTTGTTA 
APGAQ S AALIGD FNN 



CTGGGAGATTTTTCTGCCAAATAATGTGGATGGTTCTCCTGCAATT 
— ^ h— H 1 I n . , , I ^ 

GACCCTCTAAAAAGACGGTTTATTACACCTACCAAGAGGACGTTAA 

WE I FLPNNVDGSP A I 

TGTTAAGGATTCC.ATTCCTGCTTGGATCAACTACTCTTTACAGCTT 
' 1 ' ' I ' ' I 1 . . I . 

ACAATTCCTAAGGTAAGGACGAACCTAGTTGATGAGAAATGTCGAA 
VK DSI PAWINY SLQL 

AGAGGA GAGGTATRTCTTCCAACACCCACGGCCAAAGAAACCAAAG 
' H 1 H 1 1 1 

TCTCCTCTCCATAYAGAAGGTTGTGGGTGCCGGTTTCTTTGGTTTC 

eery?fqhprpkkpkJ 



'sheet 
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ATGGGTTTCACTCGTAGTGCTACAGGTATCACTTACCGTGAGTG 
' I ' ' ' ' ' I ' ' I . H H 

TACCCAAAGTGAGCATCACGATGTCCATAGTGAATGGCACTCAC 
"GFTRSATGIT YREW 



TGGGACGCAAA TGCTGACATTATGACTCGGAATGAATTTGGTRT 
^ — ^ 1 ' — — H 1 . I 1- 

ACCCTGCGTTTACGACTGTAATACTGAGCCTTACTTAAACCACA 
WDANADIMTRNEFGV 

CCTCATGGGTCCAGAGTGAAGATACGYATGGACACTCCATCA GG 

GGAGTACCCAGGTCTCACTTCTATGCRTACCTGTGAGGTAGTCC 
P^^S SRVKIRMDTPSG 



CC 



TGATGAAATTCCATATAATGGAATATATTATGATCCACCCGA 



' ' ' ' ' — ' ' ' ' ' — ^ 900 
PDE IPYNGIYYDPPE 



GGACTACTTTAAGGTATATTACCTTATATAATACTAGGTGGGCT 



TCGCTGAGAATATATGAATCTCATATTGGAATGAGTAGTCCGGA 

AGCGACTCTTATATACTTAGAGTATAACCTTACTCATCAGGCCT 
SI-RIYESHIGMSSPE 
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40/75 



Xmn I 

GCCTAAAATTAACTCATACGTGAATTTTAGAGATGAAGTTCTTCCT'^ 

— ^ H ■ I . I . . I . I 1 .... I . 

CGGATTTTAATTGAGTATGCACTTAAAATCTCTACTTCAAGAAGGA 
PKINSYVNFRDE VLP 

TCAAGAGCATTCTTATTATGCTAGTTTTGGTTATCATGTCACAAAT 
' H 1 1 , 1 , 1 .... I -, 

AGTTCTCGTAAGAATAATACGATCAAAACCAATAGTACAGTGTTTA 

QEHSY YASFGYHVTN 

GTCTTTGATTGATAAAGCTCATGAGCTAGGAATTGTTGTTCTCATG 

^ h— H H -4-- 1 ^ 1 

CAGAAACTAACTATTTCGAGTACTCGATCCTTAACAACAAGAGTAC 
SLIDKAHELGIVVLM 



GAACATGTTTGACGGCACAGATAGTTGTTACTTTCACTCTGGAGCT 
" ^ 1 ' ' ' I I ... I I ^^ 

CTTGTACAAACTGCCGTGTCTATCAACAATGAAAGTGAGACCTCGA 
NMFDGTDSCYFHSGA 

AAACTGGGAGGTACTTAGGTATCTTCTCTCAAATGCGAGATGGTGG 

H ■ ' I I . ' r . I . . I ■ 

TTTGACCCTCCATGAATCCATAGAAGAGAGTTTACGCTCTACCACC 
NWE VLRYLLSNAR WW 

ATCAATGATGTATACTCACCACGGATTATCGGTGGGATTCACTGGG 
^ 1 ' r . . . . ( I 1 i 

TAGTTACTACATATGAGT6GTGCCTAATAGCCACCCTAAGTGACCC 
SM MYTHHGLSVGFTG. 
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41/75 



CGCATAAAAAASCTTGGGTACAATGCGGTGCAAATTATGGCTAT 

— ' ' I ' ' ' I I I 1- 1080 

GCGTATTTTTTSGAACCCAT6TTACGCCACGTTTAATACCGATA 
RI K7LGYNAVQIMAI 

TTTTTTGCACCAAGCAGCCGTTTTGGAACGCCCGACGACCTTAA 

— H 1 1 1 \ i 1 , ,. ,i7Q 

AAAAAACGTGGTTCGTCGGCAAAACCTTGCGGGCTGCTGGAATT 
f^FAPSSRFGTPD DLK 

GACATTGTTCACAGCCATGCATCAAATAATACTTTAGATGGACT 
' ' ' ' ' 1 I I I 1260 

CTGTAACAAGTGTCGGTACGTAGTTTATTATGAAATCTAC.CTGA 
DIV HSH ASNNTLD.GL 

Sac I

CGTGGTTATCATTGGATGTGGGATTCCCGCCTCTTTAACTATGG 

' ' ' ' ' .1 ' I I I ■ ■ I . I . I . . — M j- 1350 

GCACCAATAGTAACCTACACCCTAAGGGCGGAGAAATTGATACC 
RGY HWMWDSRLFNYG 

TTGGATGAGTTCAAATTTGATGGATTTAGATTTGATGGTGTGAC 

1 I \ . I . . . ■ I ■ , — . I . I . . I ]t\i\o 

AACCTACTCAAGTTTAAACTACCTAAATCTAAACTACCACACTG 
LDEFKFDGFRFDGVT 

AACTACGAGGAATACTTTGGACTCGCAACTGATGTGGATGCTGT 

TTGATGCTCCTTATGAAACCTGAGCGTT6ACTACACCTACGACA 
NYEEYFGLATDVDAV 

Fig. 9 SHEET 6 



42/75 
Hinc II

.TGTGTATCTGATGCTGGTCAACGATCTTATTCACGGGCTTTTCCCA^ 
1 I I I . I I I ■ 

ACACATAGACTACGACCAGTTGCTAGAATAAGTGCCCGAAAAGGGT 
VYLMLVNDLIHGLFP 



TTGTATTCCCGTTCAAGATGGGGGTGTTGGCTTTGACTATCGGCTG 
\ 1 1 H I 1 1 

AACATAAGGGCAAGTTCTACCCCCACAACCGAAACTGATAGCCGAC 
CIPVQDGGVGFDYRL 

GGATGAGGATTGGAGAGTGGGTGATATTGTTCATACACTGACAAAT 

~ 1 — — H 'I ' ' I I — I 

CCTACTCCTAACCTCTCACCCACTATAACAAGTATGTGACTGTTTA 
DEDWRVGDIVHTLTN 



TCAAGCTCTAGTCGGTGATAAAACTATAGCATYCTGGCTGATGGAC 

' ' ' I ■ ' I ' I I . . I I . , , , I I 

ACTTCGAGATCAGCCACTATTTTGATATCGTARGACCGACTACCTG 
QALVGDKTIA7WLMD 



'Sheet 
6 



ATTAATAGATCGTGGGATAGCATTGCACAAGATGATTAGGCTTGTA 

1 1 I ■ 1— < 1 H h 

TAATTATCTAGCACCCTATCGTAACGTGTTCTACTAATCCGAACAT 
LID RGIALHKMIR 



lAACAT 
L V J 



Fig. 9 
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GATGCAATTA CCATTGGTGAAGATGTTAGCGGAATGCCGACATT 

' I ' i — — ' ' ' I ' ■ ' — I H- 1620 

CTACGTTAATGGTAACCACTTCTACAATCGCCTTACGGCTGTAA 
D AIT. IGEDVSGM PTF 

Nde I

CATATGGCAATTGCTGATAAATGGATTGAGTTGCTCAAGAAACG 

' ' 1 — ' — 1 I ' M — H 1710 

GTATACCGTTAACGACTATTTACCTAACTCAACGAGT.TCTTTGC 
^^MAIAD KWIELLKKR 

AGAAGATGGTCGG AAAA.GTGTGTTTCATMCGCTGAAAGTCATGA 

' '■ ' > ' ' I ' I ■ I I ' ' I 1800 

TCTTCTACCAGCCTTTTCACACAAAGTAKGCGACTTTCAGTACT 
RRWSEKCVS7AESHD 



Hinc II 

AAGGATATGTAT GATTTTATGGCTCTGGATAGACCGTCAACATC 

' ' ' ' ' ' I ' ' I ' I ' I 1 I I 1890 

TTCCTATACATACTAAAATACCGAGACCTATCTGGCAGTTGTAG 
KDMYDFMALDRP STS 

Asp 718 

Kpn I

ACTATGGGATT AGGAGGAGAAGGGTAC'CTAAATTTCATGGGAAA 

' — ' ' I ' ' I I . . I . . — M — « — f. 1980 

TGATACCCTAATCCTCCTCTTCCCATGGATTTAAAGTACCCTTT 
TMG LGGEGYLNFMGN 
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^EcoR I 

TGAATTCGG CCACCCTGAGTGGATTGATTTCCCTAGGGCTGARCAA-\ 
' ' ' I I I ' I I , I ■ 

ACTTAAGCCGGTGGGACTCACCTAACTAAAGGGATCCCGACTYGTT 
^P'SHPEWID FPRAEQ 

Ssp I

TGATAAATGC AGACGGAGATTTGACCTGGGAGATGCAGAATATTTA 
. ' ' I ■ [ I I I 

ACTATTTACGTCTGCCTCTAAACTGGACCCTCTACGTCTTATAAAT 

DKCRRRFDLGDAEYL 

TGAAGATAA ATATGAGTTTATGACTTCAGAACACCAGTTCATATPA 
^ ' I • ' — ^H— 1 1 ^ 1 I . 

ACTTCTATTTATACTCAAATACTGAAGTCTTGTGGTCAAGTATAGT 
^OKYEFMTSEHQ FIS 

CCTAGTTTTTGT CTTTAATTTTCACTGGACAAATAGCTATTCARAr 

' ' ' ' I I I ' . I I I 

GGATCAAAAACAGAAATTAAAAGTGACCTGTTTATCGATAAGTCTG 
'-Vf'VF NFHWTNSYSD 



GGACTCAGATGATCCACTTTTTGGTGGCTTC6GGAGAATTGATCAT 

' ' ' ' ' ' I ' ' y-^ 1 — t I , . 

CCTGAGTCTACTAGGTGAAAAACCACCGAAGCCCTCTTAACTAGTA 
DSDDPLFGGFGRID H 

YCGYYCAATTATGGTGTATGCACCTAGTAGAACAGCAGTGGTCTAT 

' ' ' ' ' I ' I I I ' I 

RGCRRGTTAA.TACCACATACGTG6ATCATCTTGTCGTCACCAGATA 
R'^IMVYAPSRT AVVY 



NGAAGAATTTT 

— H 2531 

NCTTCTTAAAA 

E E F 
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CACCTCTC TGATGGCTCAGTAATTCCCGGAAACCAATTCAGTTA 
1 ' 1 H — 1 1- 1- 

GTGGAGAGACTACCGA6TCATTAA6GGCCTTTGGTTAAGTCAAT 
HLSD GSVI PGNQFSY 



Nco I

AGATACCATGGGTTGCAAGAATTTGACCGGGCTATGCAGTATCT 

TCTATGGTACCCAACGTTCTTAAACTGGCCCGATACGTCATA6A 
RYHGLQ EFDRAMQ YL 

CGAAAGGATGA AGGAGATAGGATGATTGTATTTGAAARAGGAAA 

' ' ' ■ — ' ' ' ' ' I ■ I H 1 — 0050 

GCTTTCCTACTTCCTCTATCCTACTAACATAAACTTTYTCCTTT 
RKDEGDRMIVFE7 GN 

TATCGCATAGG CTGCCTGAAGCCTGGAAAATACAAGGTTGGCTT 

' ' ' ' I ' ' I I ' ' I 1. I 2340 

ATAGCGTATCCGACGGAC.TTCG6ACCTTTTATGTTCCAACCGAA 
Y R I G. C L K P G K Y K V G L 

Ssp I 

AATGCCGAATAT TTCACCTCTGAAGGATCGTATGATGATCGYCC 

' ■ ' I ' I'l l I I I II I I I , I 2430 

TTACGGCTTATAAAGTGGAGACTTCCTAGCATACTACTAGCR6G 
NAEY. FTSE GSYDDRP 

GCACTAGTA GACAAANTAGAAGNAGAAGAAGAAGAAGAANCCGN 

' ' I ' ' I ' ' I I . I I I I , [ 2520 

CGTGATCATCTGTTTNATCTTCNTCTTCTTCTTCTTCTTNGGCN 
ALVDK?.E?EEEEE?? 

Fig. 9 SHEET 10 
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10 

4. 



46/75 

1— 

20 
J. 



— r— 

30 
J. 



B-GATGGGBCCTTGAACTCAGCAATTTGACACTCAGT 
TliGATGGG-CCTTGAACTCAGCAATTTGACACTCAGT 
T flGATGGGBCCTTGAACTCAGCAATTTG ACACTCAGT 
IT" 



69 

70 

71 

7 

1 



208 

210 

210 

48 

1 



80^ 



9o" 



100 



tttttctcttaattccaaccaagg-aatgaataaaae 
tttttctcttaattccaaccaBggBaatgaataaaag 
tttttctcttaattccaaccaagg - aat gaataaaag 

IaaBag 




150 



160 



170 



1 38 GAAAGATGGTGTAT AC ACTCTCTGGAGTTCGTTTTCC 

140 gaaagatggtgtataDactctctggagttcgttttcc 

1^0 gaaagatggtgtatacactctctggagt tcgttttcc 

33 B^^B^^BBB^BKBici 
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220 



230 



240 
J. 



cagcagtaatggtgatcggaggaatgctaatBtttct 
cagcagtaatggtgatcggaggaatgctaatgtttct 

ca gcagtaatggtgatcggagg aatgctaatgtttr.t 

CA t y 

IGBATGCTAATGTTTCT 




290 



300 



310 




278 ATCTTGGCTGAAAAGTCTTCTTACAATTCCGAATtecC 

280 ATCTT6GCT6AAAA6TCTTCTTACAATTCCGAATTCC 

280 ATCTTGGCTGAAAAGtCTTCTTACAATTCCGAATTCC 

57 ATCTTG6CTGAAAAGTCTTCTTACAATTCCGAATTCC 

50 ATCTTGGCTGAAAAGTCTTCTTACAATTCCGAAT^CC 
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40 
1- 



— r- 

50 



6o" 



70 



TAGTTACACT@CHATCACTTATCAGATCTCTAT 10con.seq
TAGTTACACTCCTATCACTTATCAGATCTCTAT 11con.seq
TAGTTACACTCCTATCACTTATCAGATCTCTAT 19con.seq
JATOaGBBBBB^^S 86CON.SEQ

pcrsbe2con.seq




no 

J. 



"l20 



130 



TJo 

J. 



GATAGATTTGTAAAAACCCTAAGGAGAGAAGAA 10con.seq
GATAGATTTGTAAAAACCCTAAGGAGAGAAGAA 11con.seq
GATAGATTTGTAAAAACCCTAAGGAGAGAAGAA 19con.seq
GAgAQATlBBSAACfBlSAGRA^^B 86CON.SEQ

pcrsbe2con.seq




180 



190 



200 



210 




TACTGTTCCATCAGTGTACAAATGTAATGGATT 10con.seq
TACTGTTCCATCAGTGTACAAATCTAATGGATT 11con.seq
TACTGTTCCATCAGTGTACAAATCTAATGGATT 19con.seq
'!ac^tBcaBca^H^^^B^^Bt 86CON.SEQ

pcrsbe2con.seq



250 260 270 280 

GTATTCTTGAAAAABCACTCTCTTTCACGGAAG 10con.seq
GTATTCTTGAAAAAGCACTCTCTTTCACGGAAG 11con.seq
GTATTCTTGAAAAAGCACTCTCTTTCACGGAAG 19con.seq
flBDBBEHBBDBBBHBncABGGflBG 86CON.SEQ
GTATTCTTGAAAAAGCACTCTCTTTCACGGAAG pcrsbe2con.seq



320 



330 



340 

JL 



350 



GACCTTCTACAQTTGCAGCATCGGGGAAAGTCC 10con.seq
GACCTTCTACAGTTGCAGCATCGGGGAAAGTCC 11con.seq
GACCTTCTACAGTTGCAGCATCGGGGAAAGTCC 19con.seq
GACCTTCTACAGTTGCAGCATCGGGGAAAGTCC 86CON.SEQ
GACCTTCTACAGTTGCAGCATCGGGGAAAGTCC pcrsbe2con.seq
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360 



37o" 



380 
J. 



348 ttgtgcctggaaflccagagtgatagctcctcatcctc 

350 ttgtgcctggaacccagagtgatagctcctcatcctc 

350 ttgtgcctggaacccagagtgatagctcctcatcctc 

127 ttgtgcctggaacccagagtgatagctcctcatcctc 

1 20 ttgtgcctggaaDccagagtgatagctcctcatcctc 



430 
J. 



440 



450 

JL 



4 1 8 agaaaattccccagcatcaactgatgtagatagttca 

420 agaaaattccccagcatcaactgatgtagatagttca 

420, agaaaattccccagcatcaactgatgtagatagttca 

197 agaaaattccccagcatcaactgatgtagatagttca 

190 agaaaattccccagcatcaactgatgtagatagttca 



500 
J. 



510 



520 



488 AACGATGACGTTGAGCCGTCAAGTGATCTTACAGGAA 

490 AACGATGACGTTGAGCCGTCAAGTGATCTTACAGGAA 

490 AACGATGACGTTGAGCCGTCAAGTGATCTTACAGGAA 

267 AACGATGACGTTGAGCCGTCAAGTGATCTTACAGGAA 

260 AACGATGACGTTGAGCCGTCAAGTGATCTTACAGGAA 



Fig. 10
Sheet 4 



STo" 

X 



580 

X 



590 

X 



558 AACTACAAGAAGGTGGTAAACtGGAGGAGTCTAAAAC 

560 AACTACAAGAAGGTGGTAAACTGGAG6AGTCTAAAAC 

560 AACTACAAGAAGGTGGTAAACTGGAGGAGTCTAAAAC 

337 AACTACAAGAAGGTGGTAAACTGGAGGAGTCTAAAAC 

330 AACTACAAGAAGGTGGTAAACTGGAGGAGTCTAAAAC 



640 

X 



650 

X 



660 

X 



628 ATCTGATAGGATCAGAGAGAGGGGCATCCCTCCACCT 

630 ATCTGATAGGATCAGAGAGAGGGGCATCCCTCCACCT 

630 ATCTGATAGGATCAGAGAGAGGGGCATCCCTCCACCT 

407 ATCTGATAGGATCAGAGAGAGGGGCATCCCTCCACCT 

400 ATCTGATAGGATCAGAGAGAGGGGCATCCCTCCACCT^ 
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— I -T— 1 r 

390 400 410 420 

AACAGAOCAATTTGAGTTCBCTGAGACATCTCC 10con.seq
AACAGACCAATTTGAGTTCACTGAGACATCTCC 11con.seq
AACAGACCAATTTGAGTTCACTGAGACATCTCC 19con.seq
AACABACCAATTTGAGTTCACTGAGACATCTCC 86CON.SEQ
AACAGACGAATTTGAGTTCACTGAGACATCTCC pcrsbe2con.seq



^60 470 480 490 

ACAATGGAACACGCTAGCCAGATTAAAACTGAG 10con.seq
ACAATGGAACACGCTAGCCAGATTAAAACTGAG 11con.seq
ACAATGGAACACGCTAGCCAGATTAAAACTGAG 19con.seq
ACAATGGAACACGCTAGCCAGATTAAAACTGAG 86CON.SEQ
ACAATGGAACACGCTAGCCAGATTAAAACTGAG pcrsbe2con.seq

530 540 550 560 

GTGTTGAAGAGCTGGATTTTGCTTCATCACTAC 10con.seq
GTGTTGAAGAGCTGGATTTTGCTTCATCACTAC 11con.seq
GTGTTGAAGAGCTGGATTTTGCTTCATCACTAC 19con.seq
GTGTTGAAGAGCTGGATTTTGCTTCATCACTAC 86CON.SEQ
GTGTTGAAGAGCTGGATTTTGCTTCATCACTAC pcrsbe2con.seq

690 610 620 630 

ATTAAATACTTCTGAAGAGACAATTATTGATGA 10con.seq
ATTAAATACTTCTGAAGAGACAATTATTGATGA 11con.seq
ATTAAATACTTCTGAAGAGACAATTATTGATGA 19con.seq
ATTAAATACTTCTGAAGAGACAATTATTGATGA 86CON.SEQ
ATTAAATACTTCTGAAGAGACAATTATTGATGA pcrsbe2con.seq

1— 1 1 r 

670 ■ 680 690 700 

GGACTTGGTCAGAAGATTTATGAAATAGACCCC 10con.seq
GGACTTGGTCAGAAGATTTATGAAATAGACCGC 11con.seq
GGACTTGGTCAGAAGATTTATGAAATAGACCCC 19con.seq
GGACTTGGTCAGAAGATTTATGAAATAGACCCC 86CON.SEQ
GGACTTGGTCAGAAGATTTATGAAATAGACCCC pcrsbe2con.seq
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710 



720 



730 



698 CTTTTGACAAACTATCGTCAACACCTTGATTACAGGT 

700 CTTTTGACAAACTATCGTCAACACCTTGATTACAGGT 

700 CTTTTGACAAACTATCGTCAACACCTTGATTACAGGT 

477 CTTTTGACAAACTATCGTCAACACCTTGATTACAGGT 

470 CTTTTGACAAACTATCGTCAACACCTTGATTACAGGT 



780 



790 

JL 



800 

JL 



768 acaagtatgagggtggtttggaagctttttctcgtgg 

770 acaagtatgagggtggtttggaagcHttttctcgtgg 

770 acaagtatgagggtggtttggaagcfflttttctcgtgg 

547 acaagtatgagggtggtttggaagctttttctcgtgg 

540 acaagtatgagggtggtttggaagctttttctcgtgg 



850 



860 



"sTO 

JL 



838 aggtatcacttaccgtgagtgggctcctggtgcccag 

839 aggtatcacttaccgtgagtgggctcctggtgcccag 

840 aggtatcacttaccgtgagtgggctcQtggtgcccag 
617 aggtatcacttaccgtgagtgggctcctggtgcccag 
610 aggtatcacttaccgtgagtgggctcctggtgcccag 

1 1 1 

920 930 940 

908 GACGCAAATGCTGACflTTATGACTCGGAATGAATTTG 

909 GACGCAAATGCTGACATTATGACTCGGAATGAATTTG 

910 GACGCAAATGCTGACATTATGACTCGGAATGAATTTG 
687 GACGCAAATGCTGACATTATGACTCGGAATGAATTTG 
680 GACGCAAATGCTGACATTATGACTCGGAATGAATTTG 



Fig. 10
Sheets 



990 

JL 



1000 



1010 

JL 



978 ATGGTTCTCCTGCAATTCCTCATGGGTCCAGAGTGAA 

979 ATGGTTCTCCTGCAATTCCTCATGGGTCCAGAGTGAA 

980 ATGGTTCTCCTGCAATTCCTCATGGGTCCAGA6TGAA 
757 ATGGTTCTCCTGCAATTCCTCATGGGTCCAGAGTGAA 
750 ATGGTTCTCCTGCAATTCCTCATGGGTCCA6AGTGAAJ 
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5V75 

740 750 760 770 

ATTCACAGTACAAGAAACTGAGGGAGGCAATTG 10con.seq
ATTCACAGTACAAGAAACTGAGGGAGGCAATTG 11con.seq
ATTCACAGTACAAGAAACTGAGGGAGGCAATTG 19con.seq
ATTCAGAGTACAAGAAACTGAGGGAGGCAATTG 86CON.SEQ
ATTCACAGTACAAGAAACTGAGGGAGGCAATTG pcrsbe2con.seq

1 1 : — I r 

810 820 830 840 

TTATGAAAgAATGGGTTTCACTCGTAGTGCTAC 10con.seq
TTATGAAAAAATGGGTTTCACTCGTAGTGCTAC 11con.seq
TTATGAAAAAATGGGTTTCACTCGTAGTGCTAC 19con.seq
TTATGAAAAAATGGGTTTCACTCGTAGTGCTAC 86CON.SEQ
TTATGAAAAAATGGGTTTCACTCGTAGTGCTAC pcrsbe2con.seq

— I 1 1 r 

880 890 900 910 

TCAGCTGCCCTCATTGGgGATTTCAACAATTGG 10con.seq
TCAGCTGCCCTCATTGGAGATTTCAACAATTGG 11con.seq
TCAGCTGCCCTCATTGGAGATTTCAACAATTGG 19con.seq
TCAGCTGCCCTCATTGGAGATTTCAACAATTGG 86CON.SEQ
TCAGCTGCCCTCATTGGAGATTTCAACAATTGG pcrsbe2con.seq

— I , , P 

950 960 970 980 

GTGTCTGBGAGATTTTTCTGCCAAATAATGTGG 10con.seq
GTGTCTGGGAGATTTTTCTGCCAAATAATGTGG 11con.seq
GTGTCTGGGAGATTTTTCTGCCAAATAATGTGG 19con.seq
GTGTCTGGGAGATTTTTCTGCCAAATAATGTGG 86CON.SEQ
GTGTCTGGGAGATTTTTCTGCCAAATAATGTGG pcrsbe2con.seq

— I 1 1 I 

.1020 1^0 1040 1050 

GATACGTATGGACACTCCATCAGGTGTTAAGGA 10con.seq
GATACGTATGGACACTCCATCAGGTGTTAAGGA 11con.seq
GATACGTATGGACACTCCATCAGGTGTTAAGGA 19con.seq
GATACGTATGGACAGTCCATCAGGTGTTAAGGA 86CON.SEQ
GATACGDATGGACACTCCATCAGGTGTTAAGGA pcrsbe2con.seq
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1060 



1070 



1080 



1048 TTCCATTCCTGCTTGGATCAACTACTCTTTACAGCTT 

1049 TTCCATTCCTGCTTGGATCAACTACTCTTTACAGCTT 

1050 TTCCATTCCTGCTTGGATCAACTACTCTTTACA6CTT 
827 TTCCATTCCTGCTTGGATCAACTA.CTCHTACAGCTT 
820 TTCCATTCC.TGCTTGGATCAACTACTCTTTACAGCTT 



1130 



1 140 



1150 
J. 



1 1 18 GATCCACCCGAAGAGGA6AGGTATATCTTCCAACACC 

1 1 19 GATCCACCCGAAGAGGAGAGGTATATCTTCCAACACC 

1 120 GATCCACCCGAAGAGGAGAGGTATATCTTCCAACACC 
895 GATCCACCCGAAGAGGAGAGGTATATCTTCCAACACC 
890 GATCCACCCGAAGAGGAGAGGTAT0TCTT.CCAACACC 



1200 
J. 



1210 



1220 



I 188 ATGAATCTCATATTGGAATGAGTAGTCCGGAGCCTAA 

1 189 ATGAATCTCATATTGGAATGAGTAGTCCGGAGCCTAA 

1 190 ATGAATCTCATATTGGAATGAGTAGTCCGGAGCCTAA 
965 ATGAATCTCATATTGGAATGAGTAGTCCGGAGCCTAA 
960 ATGAATCTCATATTGGAATGAGTAGTCCGGAGCCTAA 
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1260 
1035 
1030 



1270 



1 



1290 



1258 TCTTCCTeGCATAAAAAAHGCTTGGGTACAATGCGBT 

1259 tcttcctcgcataaaaaa-gcttgggtacaatgcgct 
tcttcctcgcataaaaaa-ggttgggtacaatgcgct 
tcttcctcgcataaaaaa-gcttgg6tacaatgcgct 
tcttcctcgcataaaaaa-BcttgggtacaatgcgBt 

3fe_ , ^ 



1340 



1350 



1360 

X 



1328 tgctagttttggttatcatgtcacaaatttttttgca 

1328 tgctagttttggttatcatgtcacaaatttttttgca 

1329 Bgctagttttggttatcatgtcacaaatttttttgca 
1 104 tgctagttttggttatcatgtcacaaatttttttgca 
1099 tgctagttttggttatcatgtcacaaatttttttgca . 
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i1090 1100 1V10 112 0 

CCTGATGAAATTCCATATAATGGAATATATTAT 10con.seq
CCTGATGAAATTCCATATAATGGAATATATTAT 11con.seq
CCTGATGAAATTCCATATAATGGAATAHATTAT 19con.seq
CCTGATGAAATTCCATATAATGGAATATATTAT 86CON.SEQ
CCTGATGAAATTCCATATAATGGAATATATTAT pcrsbe2con.seq

— I • 1 1^ r 

1 160 1170 1180 1 190 

CACGGCCAAAGAAACCAAAGTCG@TGAGAATAT 10con.seq
CACGGCCAAAGAAACCAAAGTCGCTGAGAATAT 11con.seq
CACGGCCAAAGAAACCAAAGTCGCTGAGAATAT 19con.seq
CACGGCCAAAGAAACCAAAGTCGCTGAGAATAT 86CON.SEQ
CACGGCCAAAGAAACCAAAGTCGCTGAGAATAT pcrsbe2con.seq

- — I 1 T r 

1230 1240 1250 1^0 

AATTAACTCATACGTGAATTTTAGAGATGAAGT 10con.seq
AATTAACTCATACGTGAATTTTAGAGATGAAGT 11con.seq
AATTAACTCATACGTGAATTTTAGAGATGAAGT 19con.seq
AATTAACTCATACGTGAATTTTAGAGATGAAGT 86CON.SEQ
AATTAACTCATACGTGAATTTTAGAGATGAAGT pcrsbe2con.seq

I 1 1 r 

1300 1310 1320 1330 

GCAAATTATGGCTATTCAAGAGCATTCTTATTA 10con.seq
GCBAATTATGGCTATTCAAGAGCATTCTTATTA 11con.seq
GCAAATTATGGCTATTCAAGAGCATTCTTATTA 19con.seq
GCAAATTATGGCTATTCAAGAGCATTCTTATTA 86CON.SEQ
GCAAATTATGGCTATTCAAGAGCATTCTTATTA pcrsbe2con.seq

I 1 1 r 

1370 1380 1390 1400 

CCAAGCAGCCGTTTTGGAACGCCCGACGACCTT 10con.seq
CCAAGCAGCCGTTTTGGAACGCCCGACGACCTT 11con.seq
CCAAGCAGCCGTTTTGGAACGCCCGACGACCTT 19con.seq
CCAAGCAGCCGTTTTGGAACGCCCGACGACCTT 86CON.SEQ
CCAAGCAGCCGTTTTGGAACGCCCGACGACCTT pcrsbe2con.seq
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wo 96/34968 PCT/GB96y0107S 



1410 



5^/75 



1420 



1430 



1 398 AAGTCTTTGATTGATAAAGCTC ATGAGCTAGGAATTG 

1398 AAGTCTTgGATTGATAAAGCTCATGAGCTAGGAATTG 

1399 AAGTCTTTGATTGATAAAGCTCAT6AGCTAG6AATTG 
1 174 AAGTCTTT6ATT6ATAAAGCTCATGAGCTA6GAATTG 
1 169 AAGTCTTTGATTGATAAA6CTCATGAGCTAG6AATTG 



1480 
J. 



1490 
J. 



1500 



1468 CAAATAATACTTTAGATGGACTGAACATGTTTGACGG 

1 468 CAAATAATACTTTAGATGGACTGAACATGTTTGACGG 

1 469 CAAATAATACTTTAGATGGACTGAACATGTTTGACflG 
1244 CAAATAATACTTTAGATGGACT6AACATGTTTGACGG 
1239 CAAATAATACTTTAGATGGACTGAACATGTTTGACGG 



1538 
1538 

1539 
1314 
1309 



1550 



1560 



1570 



tggttatcattggatgtgggattjiccgcctctttaac 
tggttatcattggatgtgggattBccgcctctttaac 

tggttatcattggatgtgggattcccgcctctttaac 
tggttatcattggatgtgggattcccgcctOtttaac 
tggttatcattggatgtgggattcccgcctctttaac 



Fig. 10
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1620 



1630 

X 



1640 

X 



1 608 tcaaatgcgagatggtggttggatgagttcaaatttg 
1607 tcaaatgcgagatggtggttggatgagttcaaatttg 

1609 tcaaatgcgagatggtggttggatgHgttcaaatttg 
1384 tcaaatgcgagatggtggttggatgagttcaaatttg 
1379 tcaaatgcgagatggtggttggatgagttcaaatttg 

r— — — 1 1 



1690 

X 



1 1700 1710 

1678 TGTgTACTCACCACGGATTATGGGTGGGATTCACTGG 
1677 TGTATACTCACCACGGATTATCGGTGGGATTCACTGG 

1679 TGTATAflTCACCACGGATTATCGGTGGGATTCACTGG 
1 454 TGTATACTCACCACGGATTATCGGTGGGATTCACTGG 
1 449 TGTATACTCACCACGGATTATCGGTGGGATTCACTGG 
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1440 1450 1460 147 0 

TTGTTCTCATGGACATTGTTCACAGCCATGCAT 10con.seq
TTGTTCTCATGGACATBGTTCACAGCCATGCAT 11con.seq
TTGTTCTCATGGACATTGTTCACAGCCATGCAT 19con.seq
TTGTTCTCATGGACATTGTTCACAGCCATGCAT 86CON.SEQ
TTGTTCTCATGGACATTGTTCACAGCCATGCAT pcrsbe2con.seq

15 ^ 1520 1530 154 0 

CACBGATAGTTGTTACTTTCACTCTGGAGCTCG 10con.seq
CACCGATAGTTGTTACTTTCACTCTGGAGCTCG 11con.seq
CACCGATAGTTGTTACTTTCACTCTGGAGCTCG 19con.seq
CACCGATAGTTGTTACTTTCACTCTGGAGCTCG 86CON.SEQ
CACjGATAGTTGTTACTTTCACTCTGGAGCTCG pcrsbe2con.seq

— I 1 1 r 

1580 1590 1600 1610 

TATGGAAACTGGGAGGTACTTAGGTATCTTCTC 10con.seq
TATGGAAACTGGGAGGTACTTAGGTATCTTCTC 11con.seq
TATGGAAACTGGGAGGTACTTAGGTATCTTCTC 19con.seq
TATGGAAACTGGGAGGTACTTAGGTATCTTCTC 86CON.SEQ
TATGGAAACTGGGAGGTACTTAGGTATCTTCTC pcrsbe2con.seq

— I 1 1 r 

1650 1660 1670 168 0 

ATGGATTTAGATTTGATGGTGTGACATCAATGA 10con.seq
ATGGATTTAGATTHGATGGTGTGACATCAATGA 11con.seq
ATGGATTTAGATTTGATGGTGTGACATCAATGA 19con.seq
ATGGATTTAGATTTGATGGTGTGACATCAATGA 86CON.SEQ
ATGGATTTAGATTTGATGGTGTGACATCAATGA pcrsbe2con.seq

— J 1 -T r 

1720 1730 1740 175 0 

GAACTACGAGGAATACTTTGGACTCGCAACTGA 10con.seq
GAACTACGAGGAATACTTTGGACTCGCAACTGA 11con.seq
GAACTACGAGGAATACTTTGGACTCGCAACTGA 19con.seq
GAACTACGAGGAATACTTTGGACTCGCAACTGA 86CON.SEQ
GAACTACGAGGAATACTTTGGACTCGCAACTGA pcrsbe2con.seq
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1760 



1770 



1780 



1748 
1747 
1749 
1524 
1519 



TGTGGATGCTGTTGTGTATCTGATGCTGGTCAACGAT 
TGTGGATGCTGTTGTGTATCTGATGCTGGTCAACGAT 
TGTGGATGCTGTTGTGTATCTGATGCTGGTCAACGAT 
TGTGGATGCTGTTGTGTATCTGATGCTGGTCAACGAT 
TGTGGAT6CT6TTGTGTATCTGATGCTGGTCAACGAT 



1830 



— I — 

1840 
J. 



1850 
j- 



1818 ATTGGTGAAGATGTTAGCGGAATGCCGACATTTTGT@ 
1817 ATTGGTGAAGATGTTAGCGGAATGCCGACATTTTGTA 

1819 ATTGGTGAAGATGTTAGCGG AATGCCGACATTTTGTA 
1 594 ATTGGTGAAGATGTTA6CGGAATGCCGACATTTTGTA 
1 589 ATTGGTGAAGATGTTAGCGGAATGCCGACATTTTGTA 



1900 
J. 



1910 



1920 
J. 



1888 ATCGGCTGCATATGGCAATTGCTGATAAATGGATTGA
1887 ATCGGCTGCATATGGCAATTGCTGATAAATGGATTGA
1889 ATCGGCTGCATATGGCAATTGCTGATAAAgGGATTGA
1664 ATCGGCTGCATATGGCAATTGCTGATAAATGGATTGA
1659 ATCGGCTGCATATGGCAATTGCTGATAAATGGATTGA

Fig. 10 Sheet 12



1958 
1957 
1959 
1734 
1729 



1970 



1980 
J. 



1990 



GGGTGATATTGTTCATACACTGACAAATAGAAGATGG 
GGGTGATATTGTTCATACACTGACAAATAGAAGATGG 
GGGTGATATTGTTCATACACTGACAAATAGAAGATGG 
GGGTGATATTGTTCATACACTGACAAATAGAAGATGG 
GGGTGATATTGTTCATACACTGACAAATAGAAGATGG 



2040 



2050 

-L 



2060 



2028 gatcaagctctagtcggtgataaaactatagcattct 
2027 gatcaagctctagtcggtgataaaactatagcattct 

2029 gatcaagctctagtcggtgataaaactatagcattct 
1 804 gatc aagctctagtcggtgataaaactatagc attct 
1 799 gatcaagctctagtcggtgataaaactatagcatDct 
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— I 1 '• 1 r 

1790 1800 1810 1820 
— I I 1 1- _ 



CTTATTCATGGGCTTTTCCCAGATGCAATTACC 10con.seq
CTTATTCATflGGCTTTTCCCAGATGCAATTACC 11con.seq
CTTATTCATGGGCTTTTCCCAGATGCAATTACC 19con.seq
CTTATTCATGGGCTTTTCCCAGATGCAATTACC 86CON.SEQ
CTTATTCAHGGGCTTTTCCCAGATGCAATTACC pcrsbe2con.seq

— I 1 1 r 

1860 . 1870 1880 1890 

TTCCCGTTCAAGATGGGGGTGTTGGCTTTGACT 10con.seq
TTCCCGTTCAAGATGGGGGTGTTGGCTTTGACT 11con.seq
TTCCCGTHCAAGABGGGGGTGTTGGCTTTGACT 19con.seq
TTCCCGTTCAAGATGGGGGTGTTGGCTTTGACT 86CON.SEQ
TTCCCGTTCAAGATGGGGGTGTTGGCTTTGACT pcrsbe2con.seq

1930 1940 1950 196 0 

GTTGCTCAAGAAACGGGATGAGGATTGGAGAGT 10con.seq
GTTGCTCAAGAAACGGGATGAGGATTGGAGAGT 11con.seq
GTTGCTCAAGAAACGGGATGAGGATTGGAGAGT 19con.seq
GTTGCTCAAGAAACGGGATGAGGATTGGAGAGT 86CON.SEQ
GTTGCTCAAGAAACGGGATGAGGATTGGAGAGT pcrsbe2con.seq

2000 2010 2020 203 0 

TCGGAAAAGTGTGTTTCATACGCTGAAAGTCAT 10con.seq
TCGGAAAAGTGTGTTTCATACGCTGAAAGTCAT 11con.seq
TCGGAAAAGTGTGTTTCATACGCTGAAAGTCAT 19con.seq
TCGGAAAAGTGTGTTTCATACGCTGAAAGTCAT 86CON.SEQ
TCGGAAAAGTGTGTTTCATECGCTGAAAGTCAT pcrsbe2con.seq

— , , , P 

2070 2080 2090 210 0 

GGCTGATGGACAAGGATATGTATGATTTTATGG 10con.seq
GGCTGATGGACAAGGATATGTATGATTTTATGG 11con.seq
GGCTGATGGACAAGGATATGTATGATTTTATGG 19con.seq
GGCTGATGGACAAGGATATGTATGATTTTATGG 86CON.SEQ
GGCTGATGGACAAGGATATGTATGATTTTATGG pcrsbe2con.seq
2110 2120 2130

2098 ctctggatagaccgtcaacatcattaatagatcgtgg
2097 ctctggatagaccggcaacatcattaatagatcgtgg
2099 ctctggatagaccgtcaacatcattaatagatcgtgg
1874 ctctggatagaccgHcaacatcattaatagatcgtgg
1869 ctctggatagaccgDcaacaDcattaatagatcgtgg

2180 2190 2200

2168 tatgggattaggaggagaagggtacctaaatttcatg
2167 tatgggattaggaggagaagggtagctaaatttcatg
2169 tatgggattaggaggagaagggtacctaaatttcatg
1944 tatgggattaggaggagaagggtacctaaatttcatg
1939 tatgggattaggaggagaagggtacctaaatttcatg

2250 2260 2270

2238 ttccctagggctgaeaacacctctctgatggctcag
2237 ttccctagggctgaBcHacacctQtctgatggctcag
2239 ttccctagggctgaacaacacctctctgatggctcag
2014 ttccctagggctgaacaacacctctctgatgHctcag
2009 ttccctagggctgy^aacacctctctgatggctcag

2320 2330 2340

2308 GCAGACGGAGATTTGACCTGGGAGATGCAGAATATTT
2307 GCAGACGGAGATTTGACCTGGGAGATGCAGAATATTT
2309 GCAGACGGAGATTTGACCTGGGAGATGCAGAATATTT
2084 GGAGACGGAGATTTGACCTGGGAGATGCAGAATATTT
2079 GCAGACGGAGATTTGACCTGGGAGATGCAGAATATTT

2390 2400 2410

2378 TATGCAGTATCTTGAAGATAAATATGAGTTTATGACT
2377 TATGCAGTATCTTGAAGATAAATATGAGTTTATGACT
2379 TATGCAGTATCTTGAAGATAAATATGAGTTTATGACT
2154 TATGCAGTATCTTGAAGATAAATATGAGTTTATGACT
2149 TATGCAGTATCTTGAAGATAAATATGAGTTTATGACT

Fig. 10
2140 2150 2160 2170

GATAGCATTBCACAAGATGATTAGGCTTGTAAG 10con. seq
GATAGCATTGCACAAGATGATTAGGCTTGTAAG 11con. seq
GATAGCATTGCACAAGATGATTAGGCTTGTAAC 19con. seq
GATAGCATTGCACAAGATGATTAGGCTTGTAAC 86CON. SEQ
GATAGCATTGCACAAGATGATTAGGCTTGTAAC pcrsbe2con. seq

2210 2220 2230 2240

GGAAATGAATTCGGCCACCCTGAGTGGATTGAT 10con. seq
GGAAATGAATTCGGCCACCCTGAGTGGATTGAT 11con. seq
GGAAATGAATTCGGCCACCCTGAGTGGATTGAT 19con. seq
GGAAATGAATTCGGCCACCCTGAGTGGATTGAT 86CON. SEQ
GGAAATGAATTCGGCCACCCTGAGTGGATTGAT pcrsbe2con. seq

2280 2290 2300 2310

TAATTCCC0GAAACCAATTCAGTTATGATAAAT 10con. seq
TAATTCCCGGAAACCAATTCAGTTATGATAAAT 11con. seq
TAATBCCCGGAAACCAATTCAGTTATGATAAAT 19con. seq
TAATTCCCGGAAACCAATTCAGTTATGATAAAT 86CON. SEQ
TAATTCCCGGAAACCAATTCAGTTATGATAAAT pcrsbe2con. seq

2350 2360 2370 2380

AAGATACCGTGGGTTGCAAGAATTTGACCGGGC 10con. seq
AAGATACCBTGGGTTHCAAGAATTTGACQGGGC 11con. seq
AAGATACCGTGGGTTGCAAGAATTTGACCGGflC 19con. seq
AAGATACCGTGGGTTGCAAGAATTTGACCGGGC 86CON. SEQ
AAGATACCHTGGGTTGCAAGAATTTGACCGGGC pcrsbe2con. seq

2420 2430 2440 2450

TCAGAACACCAGTTCATATCACGAAAGGATGAA 10con. seq
TCAGAACACCAGTTCATATCACGAAAGGATGAA 11con. seq
TCAGAACACCAGTTCATATCACGAAAGGATGAA 19con. seq
TCAGAACACCAGTTCATATCACGAAAGGATGAA 86CON. SEQ
TCAGAACACCAGTTCATATCACGAAAGGATGAA pcrsbe2con. seq
2460 2470 2480

2448 ggagataggatgattgtatttgaaaaaggaaacctag
2447 ggagataggatgattgtatttgaaaEaggaaacctag
2449 ggagataggatgattgtatttgaaaaaggaaacctag
2224 GGAGATAGGATGATTGTATTTGAAAAAGGAAACCTAG
2219 GGAGATAGGATGATTGTATTTGAAABAGGAAACCTAG

2530 2540 2550

2518 ATTCAGACTATCGCATAGGCTGCCTGAAGCCTGGAAA
2517 ATTCAGACTATCGCATAGGCTGCCTGAAGCCTGGAAA
2519 attcagactatcgcatagBctgcctgaagcctggaaa
2294 attcagactatcgcataggctgcctgaagcctggaaa
2289 attcagactatcgcataggctgcctgaagcctggaaa

2600 2610 2620

2588 TTTTGGTGGCTTCGGGAGAATTGATCATAATGCCGAA
2587 TTTTGGTGGCTTCGGGAGAATTGATCATAATGCCGAA
2589 TTTTGGTGGCTTCGGGAGAATTGATCATAATGCCGAA
2364 TTTTGGTGGCTTCGGGAGAATTGATCATAATGCCGAA
2359 TTTTGGTGGCTTCGGGAGAATTGATCATAATGCCGAA

Fig. 10
2670 2680 2690

2658 CCTCGTTCAATTATGGTGTATGCACCTAGTAGAACAG
2657 CCTflGTTCAATTATGGTGTATGCACCTAGTAGAACAG
2659 CCTCGTTCAATTATGGTGTATGCACCTii6TA0AACAG
2434 CCTCGTTCAATTATGGTGTATGCACCTUGTAGAACAG
2429 CCTCGTTCAATTATGGTGTATGCACCTAGTAGAACAG



2740 2750 2760



2722 
2722 
2729 
2501 



iAAGAAGAAGAAGAAGAAGAAGTAGCAGTAGT 

lAGAAGTAGCAGTAGT 



aagaagaagaagaagaagaagaagaagtagcagBagt 
aagaagaagaagaagaagaagaagaagtagcagtagt 

2499 BaGAAGAAGAAGAAGAAI 







Fig. 10 SHEET 15 






2490 2500 2510 2520

TTTTTGTCTTTAATTTTCACTGGACAAAAWGCT 10con. seq
TTTTHGTCTTTAATTTTCACTGGACAAABAGCT 11con. seq
TTTTTGTCTTTAATTTTCACTGGACAAAAAGCT 19con. seq
TTTTTGTCTTTAATTTTCACTGGACAAAAAGCT 86CON. SEQ
TTTTTGTCTTTAATTTTCACTGGACAAAOAGCT pcrsbe2con. seq

2560 2570 2580 2590

ATACAAGGTTGCCTTGGACTCAGATGATCCACT 10con. seq
ATACAAGGTTGQCTTGGACTCAGATGATCCACT 11con. seq
ATACAAGGTTGCCTTGGACTCAGATGATCCACT 19con. seq
ATACAAGGTTGCCTTGGACTCAGATGATCCACT 86CON. SEQ
ATACAAGGTTGgCTTGGACTCAGATGATCCACT pcrsbe2con. seq

2630 2640 2650 2660

TATTTCACCTTTGAAGGATGGTATGATGATCGJ 10con. seq
TATTTCACCTHTGAAGGATHGTATGATGATCGT 11con. seq
TATTTCACCTTTGAAGGATGGTATGATGATCGT 19con. seq
TATTTCACCTTTGAAGQATGGTATGATGATCGT 86CON. SEQ
TATTTCACCTBJGAAGGATBGTATGATGATCGT pcrsbe2con. seq

2700 2710 2720 2730

CAGTGGTCTATGCACTAG1AGAIAAAuB--B| 10con. seq
CAGIGGTCTATGCACTAGTAGACAAAMT-JBI 11con. seq
CAGTGGTCTATGCACTAGTAGACAAAGE^AAG 19con. seq
CAGTGGTCTATGCACTAGTAGACAAAGBRAAG 86CON. SEQ
CAGTGGTCTATGCACTAGTAGACAAAEraAAG pcrsbe2con. seq

2770 2780 2790 2800

AGAAGAAGTAGTAGTAGAAGAAGAATGAACGAA 10con. seq
AGAAGAA fgira^ TflG— AAGAATGAACGAA 11con. seq
AGAAGAAGTAGTAGTAGAAGAAGAATGAACGAA 19con. seq
AGAAGAAGTAGTAGTAGAAGAAGAATGAACGAA 86CON. SEQ
gBBBBflBBiB3(^ISISQAAGAATBB3B pcrsbe2con. seq
2810 2820 2830



g-u g^u ^iOOU 

57QQ ^Ji^3^^I£GCGTTGAAAGATTTGAACGrACDT@GB 
2571 ctIgTGIIBHH^^ 



2529 



2856 



2880 2890 2900



PfiPQ rllrrrrr A^JS^'^ ' ^ ' ^^ACAACA-GGTITGCAATT 

9fl«Q ^^^S^^^^^^^S^a'^g'tgacaacaBggtttgcaBtt 

2869 CTTGGCGGAATTTr ATGTGACAHgA-PnTTTnrZfl T 
2529 




2950 2960 2970

2899 rJrJIpA^ni^^iS^^^^A^CATATGTAAAATCGA 
pq?R p^paI^^^SI^^^^^A^AAA-CATATGTAAAATCGA 

111^ gagatgaa_gtgctgaacaaa--catatgtaaaatcga 

2529 



3020 3030 

2995 CCTGCAG--- r 

2967 CCTGCAG r 

3006 C C T G C A G gHW rielelclcli l ilil ilih glSBmr ti 
2576 »m iimimiiif niiLii 

2529 _ ___ mj^ 
2840 2850 2860 2870

GCTTCTTGACGTATCTGGCAATATTGCATjjAGT 10con. seq
BlBBTlggACaTA eHag GC iiigl Tg^CATCAGT 11con. seq
GCTTCTTGACGTATCTGGCAATATTGCATCAGT 19con. seq
86CON. SEQ
pcrsbe2con. seq




2910 2920 2930 2940

CTTTCCACTATTAGTAGTGCAACGATATACGCA 10con. seq
CTTTCCACTATTAGTAGTHCAHCGATATACGCA 11con. seq
CTTTCCACTATTAGTAGTGCAACGATATACGCA 19con. seq
86CON. SEQ
pcrsbe2con. seq




2980 2990 3000 3010

TGAATTTATGTCGAATGCTGGGAGGATCGAATT 10con. seq
TGAATTTATGTCGAATGCTGGGACGATCGAATT 11con. seq
TGAATTTATGTCGAATGCTGGGACGATCGAATT 19con. seq
86CON. SEQ
pcrsbe2con. seq





10con. seq
11con. seq
19con. seq
86CON. SEQ
pcrsbe2con. seq
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